
Support multi dss shard #265

Merged
lzxddz merged 2 commits into eloqdata:main from lzxddz:dss_multi_shard-rebase
Nov 21, 2025

Conversation

Collaborator

@lzxddz lzxddz commented Nov 6, 2025

Summary by CodeRabbit

  • New Features

    • Simplified startup for bootstrap and single-node deployments; topology now supports a local solo-node path.
    • Slot handling switched to per-slot granularity for finer routing and moved-error reporting.
  • Bug Fixes

    • Relaxed TTL/PTTL test timing to tolerate clock skew between nodes.
  • Chores

    • Removed a deprecated build configuration option and related build flags.
    • Updated recorded subproject references.
  • API

    • Public initialization and client APIs adjusted to reflect bootstrap/single-node semantics.



coderabbitai bot commented Nov 6, 2025

Walkthrough

This PR changes Redis/DSS bootstrap vs multi-node initialization, converts bucket-based mapping to per-slot (16384) addressing, simplifies the DataStoreServiceClient/StartService APIs, renames a GenerateMovedErrorMessage parameter, removes the USE_ONE_ELOQDSS_PARTITION_ENABLED CMake option and CI flags, and updates two submodule pointers.

Changes

Cohort / File(s) Summary
Bootstrap & topology init
src/redis_service.cpp
Add explicit bootstrap branch that builds a solo local node config (node_id 0) and derives ng_leaders locally; bypass ReadClusterConfigFile/ParseNgConfig in bootstrap/single-node paths and adjust DSS topology loading and peer-fetch logic accordingly.
DataStore client & API signatures
src/redis_service.cpp, include/redis_service.h
Expand EloqDS::DataStoreServiceClient constructor to accept bootstrap and peer-presence booleans; update TxConfigsToDssClusterConfig(...) signature to (node_id, ng_configs, ng_leaders, ds_config); simplify DataStoreService::StartService(...) usage to a single bootstrap/single-node bool and adjust call sites.
Slot mapping & MOVED errors
src/redis_service.cpp, include/redis_service.h, src/redis_command.cpp
Replace per-bucket terminology with per-slot (16384) granularity: GetNodeSlotsInfo expands buckets to per-slot ids; ClusterKeySlotCommand now uses key_.Hash() & 0x3fff; rename GenerateMovedErrorMessage(uint16_t slot_num) to GenerateMovedErrorMessage(uint16_t slot_id) and update all call sites to pass slot_id.
Build flag & CI script removals
CMakeLists.txt, concourse/scripts/build_tarball.bash, concourse/scripts/common.sh
Remove the USE_ONE_ELOQDSS_PARTITION_ENABLED CMake option and strip -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=... flags from build/CI scripts.
Submodule bumps
manifest entries (store_handler, tx_service)
Update store_handler and tx_service submodule commit pointers in manifest (metadata-only submodule reference updates).
Tests
tests/unit/mono/expire.tcl
Relax PTTL assertion upper bound from <= 1000 to <= 1100 and add comments explaining possible clock skew between client and key-location nodes.
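The per-slot addressing summarized above can be sketched in isolation. This is a minimal illustration, not the repository's code: `KeyHash` is a stand-in for the real `key_.Hash()`, and the bucket relation assumes `txservice::total_range_buckets == 1024` as asserted in the PR's slot-mapping code.

```cpp
#include <cstdint>
#include <functional>
#include <string>

constexpr uint16_t kSlotMask = 0x3fff;    // 16384 slots (hash & 0x3fff)
constexpr uint16_t kTotalBuckets = 1024;  // txservice::total_range_buckets

// Stand-in for key_.Hash(); the real hash function in the codebase may differ.
inline uint64_t KeyHash(const std::string &key) {
    return std::hash<std::string>{}(key);
}

// Slot id as described in the walkthrough: hash & 0x3fff.
inline uint16_t KeySlot(const std::string &key) {
    return static_cast<uint16_t>(KeyHash(key) & kSlotMask);
}

// Under the 1024-bucket scheme, slot s belongs to bucket s % 1024,
// i.e. bucket b owns the 16 slots {b, b + 1024, ..., b + 15 * 1024}.
inline uint16_t SlotToBucket(uint16_t slot_id) {
    return slot_id % kTotalBuckets;
}
```

Any slot id produced this way is in 0..16383, and each bucket owns exactly 16 slots spaced 1024 apart.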

Sequence Diagram(s)

sequenceDiagram
    participant Redis as RedisServiceImpl::Init
    participant Config as ClusterConfig
    participant DS as DataStoreService
    participant Client as DataStoreServiceClient

    rect rgb(230,245,255)
    Note over Redis,Config: Bootstrap path (bootstrap=true)
    Redis->>Config: construct solo ng_configs (local node_id=0)
    Config->>Config: derive ng_leaders from local config
    Redis->>DS: StartService(bootstrap=true)
    DS->>Client: ctor(bootstrap=true, peer_present=...)
    end

    rect rgb(245,230,255)
    Note over Redis,Config: Non-bootstrap path (bootstrap=false)
    Redis->>Config: ReadClusterConfigFile / ParseNgConfig
    Config->>Config: build multi-node topology
    Redis->>DS: StartService(bootstrap=false)
    DS->>Client: ctor(bootstrap=false, peer_present=...)
    end
sequenceDiagram
    participant Cmd as ClusterKeySlotCommand
    participant Hash as Hash/Slot calc
    participant Err as GenerateMovedErrorMessage

    Cmd->>Hash: key.Hash()
    Hash->>Hash: slot_id = hash & 0x3fff
    Hash->>Hash: owning bucket = slot_id % 1024 (16 slots per bucket)
    Hash->>Err: GenerateMovedErrorMessage(slot_id)
    Err->>Cmd: reply MOVED with slot_id

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • Review bootstrap vs non-bootstrap branching and correctness of derived tmp_node_id/ng_leaders in src/redis_service.cpp.
  • Verify all updated EloqDS::DataStoreServiceClient constructor call sites and DataStoreService::StartService uses.
  • Validate TxConfigsToDssClusterConfig argument order and semantics (node_id vs ng_leaders).
  • Check slot expansion/masking (0..16383) and MOVED error payloads for off-by-one or range errors.
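The range check in the last bullet can be done mechanically. Below is a minimal sketch, assuming the PR's scheme of 1024 buckets × 16 slots per bucket (the function name is hypothetical), which verifies that the expansion formula hits every slot id exactly once:

```cpp
#include <bitset>
#include <cstdint>

// Expand every bucket to its 16 slots and verify each of the 16384
// slot ids is produced exactly once (no gaps, no overlaps).
inline bool ExpansionCoversAllSlots() {
    constexpr uint32_t kBuckets = 1024;
    constexpr uint32_t kSlotsPerBucket = 16;
    std::bitset<kBuckets * kSlotsPerBucket> seen;
    for (uint32_t bucket = 0; bucket < kBuckets; ++bucket) {
        for (uint32_t i = 0; i < kSlotsPerBucket; ++i) {
            uint32_t slot = bucket + i * kBuckets;  // same formula as the PR
            if (seen.test(slot)) {
                return false;  // duplicate slot id
            }
            seen.set(slot);
        }
    }
    return seen.all();  // every slot in 0..16383 hit exactly once
}
```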

Suggested reviewers

  • liunyl
  • xiexiaoy

Poem

🐇 I hopped through hashes, found slots galore,

From buckets to four thousand and more.
Bootstrap snug in a single-node den,
Flags tucked away, submodules bumped then—
A rabbit cheers this tidy code shore!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 11.11%, which is below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed — The PR title 'Support multi dss shard' refers to a real aspect of the changeset (the multi-shard Data Store Service topology changes), but does not fully capture the parameter renames, signature updates, and bootstrap path refactoring also included.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between da776d0 and b996388.

📒 Files selected for processing (3)
  • src/redis_service.cpp (2 hunks)
  • store_handler (1 hunks)
  • tx_service (1 hunks)
🔇 Additional comments (3)
store_handler (1)

1-1: Verify submodule commit and alignment with PR objectives.

The submodule pointer is being updated as part of the multi-DSS shard support refactoring. Please confirm that:

  1. The new commit (776a8792d67354ea827560fa0aaa37cdabdf98f3) exists and is from the correct store_handler repository.
  2. The changes in that commit are compatible with the DSS topology initialization refactoring described in this PR (specifically the runtime-constructed ng_leaders mappings and bootstrap mode handling).
  3. Integration testing has validated that store_handler changes work correctly with the updated DataStoreService/DataStoreServiceClient constructors and the new topology initialization flow.
tx_service (1)

1-1: Verify submodule compatibility with multi-shard DSS topology changes.

This is a submodule pointer update for tx_service. Since the PR introduces changes to DataStoreService/DataStoreServiceClient constructors and moves topology initialization to runtime-driven ng_leaders mappings, please confirm that the new tx_service commit (c7e7581e402181f9ef2ed651c6017233288f6b41) is:

  1. Compatible with the updated constructor signatures
  2. Tested with the new multi-shard topology initialization logic
  3. A stable or well-tested commit (consider documenting any significant API changes or migration notes)
src/redis_service.cpp (1)

1039-1050: Verify behavior with empty ng_leaders in multi-node non-bootstrap scenarios.

The ng_leaders map is only populated when FLAGS_bootstrap || is_single_node is true (lines 1040-1046), but it's passed to TxConfigsToDssClusterConfig unconditionally (lines 1048-1049). This means in multi-node non-bootstrap scenarios, an empty ng_leaders map is passed. Please confirm whether this is the intended behavior and that TxConfigsToDssClusterConfig correctly handles empty leadership mappings for normal multi-node operation.

@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from b996388 to 07d1760 on November 7, 2025 07:33
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b996388 and 07d1760.

📒 Files selected for processing (3)
  • src/redis_service.cpp (2 hunks)
  • store_handler (1 hunks)
  • tx_service (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • store_handler
  • tx_service
🧰 Additional context used
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (2)
src/redis_service.cpp (2)

1083-1095: Follow up on past review: Verify the updated initialization logic is correct.

The past review comment flagged unused variables init_ng_leaders and bootstrap_shards. The current implementation now uses ng_leaders (line 1039) and passes it to TxConfigsToDssClusterConfig (line 1050-1051), and the simplified constructor is used (lines 1091-1095).

However, please verify:

  1. The ng_leaders logic correctly replaces the old init_ng_leaders initialization
  2. The simplified StartService call (lines 1083-1084) matches the intended multi-shard support design
  3. The DataStoreServiceClient constructor (lines 1091-1095) correctly uses ds_config which was populated by TxConfigsToDssClusterConfig

Based on past review comment and current implementation.


1033-1035: Good addition: SetThisNode after fetching config from peer.

Calling SetThisNode to identify the local node within the fetched cluster configuration is a clean approach to integrate with an existing cluster.

@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from 07d1760 to 9eceec5 on November 7, 2025 11:40
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/redis_service.cpp (1)

2172-2192: Guard against division-by-zero when ng_leaders is empty in GetNodeSlotsInfo.

node_group_cnt = ng_leaders.size() can be 0; reserve(bucket_owners.size() / node_group_cnt + 1) then divides by zero. Add an early return (or treat as config error).

Apply this minimal fix:

@@
-    std::unordered_map<NodeId, std::vector<uint16_t>> owner_to_bucket_ids;
-    uint32_t node_group_cnt = static_cast<uint32_t>(ng_leaders.size());
+    std::unordered_map<NodeId, std::vector<uint16_t>> owner_to_bucket_ids;
+    uint32_t node_group_cnt = static_cast<uint32_t>(ng_leaders.size());
+    if (node_group_cnt == 0) {
+        DLOG(WARNING) << "GetNodeSlotsInfo called with empty ng_leaders; no slot mapping produced.";
+        return;
+    }
@@
-            owner_to_bucket_ids.at(leader_node_id)
-                .reserve(bucket_owners.size() / node_group_cnt + 1);
+            owner_to_bucket_ids.at(leader_node_id)
+                .reserve(bucket_owners.size() / node_group_cnt + 1);
@@
-            owner_to_bucket_ids.at(leader_node_id)
-                .reserve(bucket_owners.size() / node_group_cnt + 1 +
-                         current_size);
+            owner_to_bucket_ids.at(leader_node_id)
+                .reserve(bucket_owners.size() / node_group_cnt + 1 +
+                         current_size);
🧹 Nitpick comments (1)
src/redis_service.cpp (1)

2211-2240: Avoid magic numbers and harden slot range construction.

Replace 1024/16 literals with named constants and initialize end_slot_range for robustness (single-slot edge).

Apply this refactor:

-        // map buckets to slots
-        assert(txservice::total_range_buckets == 1024);
-        std::vector<uint32_t> slot_ids;
-        // slots count is 16 times of bucket count
-        slot_ids.reserve(bucket_ids.size() * 16);
-        for (uint16_t bucket_id : bucket_ids)
-        {
-            for (int i = 0; i < 16; i++)
-            {
-                slot_ids.push_back(bucket_id + i * 1024);
-            }
-        }
+        // Map buckets to slots (Redis: 16384 slots, 1024 buckets → 16 slots/bucket)
+        constexpr uint32_t kBucketsPerRange = txservice::total_range_buckets;
+        constexpr uint32_t kSlotsPerBucket = 16;
+        assert(kBucketsPerRange == 1024);
+        std::vector<uint32_t> slot_ids;
+        slot_ids.reserve(bucket_ids.size() * kSlotsPerBucket);
+        for (uint16_t bucket_id : bucket_ids) {
+            for (uint32_t i = 0; i < kSlotsPerBucket; ++i) {
+                slot_ids.push_back(bucket_id + i * kBucketsPerRange);
+            }
+        }
@@
-        auto res = nodes_slots.try_emplace(node_id);
-        std::vector<SlotPair> &slots_vec = res.first->second;
-        SlotPair *slot_pair = &(slots_vec.emplace_back());
-        slot_pair->start_slot_range = slot_ids.front();
+        auto res = nodes_slots.try_emplace(node_id);
+        std::vector<SlotPair> &slots_vec = res.first->second;
+        SlotPair *slot_pair = &(slots_vec.emplace_back());
+        slot_pair->start_slot_range = slot_ids.front();
+        slot_pair->end_slot_range = slot_pair->start_slot_range; // initialize

Optional: pre-sort bucket_ids once, then generate slot_ids in sorted order by iterating i then buckets to avoid sorting a 16× larger vector.
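That optional note can be made concrete: if bucket ids are sorted once up front, iterating the multiplier band in the outer loop emits slot ids already ascending, because band i occupies the disjoint range [i*1024, i*1024 + 1023]. A hypothetical helper sketching this:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Generate slot ids already sorted: sort the (smaller) bucket list once,
// then emit band by band. Bands are disjoint and increasing, so the
// concatenated output is globally ascending without a final sort.
inline std::vector<uint32_t> SortedSlotIds(std::vector<uint16_t> bucket_ids) {
    constexpr uint32_t kBuckets = 1024;
    constexpr uint32_t kSlotsPerBucket = 16;
    std::sort(bucket_ids.begin(), bucket_ids.end());
    std::vector<uint32_t> slot_ids;
    slot_ids.reserve(bucket_ids.size() * kSlotsPerBucket);
    for (uint32_t i = 0; i < kSlotsPerBucket; ++i) {
        for (uint16_t bucket : bucket_ids) {
            slot_ids.push_back(bucket + i * kBuckets);
        }
    }
    return slot_ids;
}
```

This sorts a 16× smaller vector than sorting after expansion.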

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 07d1760 and 9eceec5.

📒 Files selected for processing (6)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (1 hunks)
  • src/redis_service.cpp (4 hunks)
  • store_handler (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
✅ Files skipped from review due to trivial changes (1)
  • tx_service
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-09-26T05:07:45.185Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: concourse/scripts/build_tarball_open.bash:121-121
Timestamp: 2025-09-26T05:07:45.185Z
Learning: The open log service in build_tarball_open.bash only supports ROCKSDB as the single log state option, unlike the main eloq log service which supports multiple WITH_LOG_STATE options. Therefore, no WITH_LOG_STATE configuration flag is needed for the open log service build.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T09:00:44.250Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: build_log_service.cmake:103-118
Timestamp: 2025-09-24T09:00:44.250Z
Learning: The build_log_service.cmake file is specifically for building the log_service submodule which only supports ROCKSDB log state, unlike build_eloq_log_service.cmake which handles multiple WITH_LOG_STATE options (MEMORY, ROCKSDB, ROCKSDB_CLOUD_S3, ROCKSDB_CLOUD_GCS). Therefore, requiring RocksDB unconditionally in build_log_service.cmake is the correct behavior.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T08:53:57.928Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: src/redis_service.cpp:94-118
Timestamp: 2025-09-24T08:53:57.928Z
Learning: The LOG_STATE_TYPE_RKDB_CLOUD macro gating in src/redis_service.cpp includes `&& !defined(LOG_STATE_TYPE_RKDB)` condition intentionally to prevent conflicts when LOG_STATE_TYPE_RKDB is defined unconditionally in some builds (build_log_service.cmake) alongside cloud variants. The CMake configuration in build_eloq_log_service.cmake ensures mutual exclusivity of log state types.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T08:45:08.818Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: build_log_service.cmake:128-135
Timestamp: 2025-09-24T08:45:08.818Z
Learning: The LOG_STATE_TYPE_RKDB compile definition in build_log_service.cmake is kept unconditional for compatibility reasons, despite architectural concerns about conflicting with WITH_LOG_STATE paradigm.

Applied to files:

  • concourse/scripts/common.sh
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (9)
store_handler (1)

1-1: Verify the submodule pointer update in your local environment.

The sandbox environment couldn't access the store_handler submodule's commit history (both the old and new commits are not available in the remote). You'll need to manually verify locally that:

  1. The new commit hash (adda6517bec8363cd367580b6969c584229d39bd) is valid and accessible
  2. The changes align with the API signature updates (TxConfigsToDssClusterConfig, DataStoreServiceClient, DataStoreService::StartService) mentioned in this PR
  3. The submodule changes support the multi-shard DSS bootstrap logic refactoring

Run this in your local repository to inspect the changes:

cd store_handler
git log --oneline ce0e9317fd486a0e5a1342228ca4f2805d492daf..adda6517bec8363cd367580b6969c584229d39bd
git diff ce0e9317fd486a0e5a1342228ca4f2805d492daf..adda6517bec8363cd367580b6969c584229d39bd -- '*.h' '*.hpp'
concourse/scripts/common.sh (2)

142-201: Correct deprecation of single-partition flag in run_build.

Removing -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=ON aligns with the PR goal to support multi-shard DSS. The cmake invocation remains well-formed and the conditional DSS build logic (lines 190–197) is unaffected.


203-217: Verify consistency: run_build_ent may lack the same flag removal.

The provided code for run_build_ent (lines 209–217) does not show the ~ mark indicating removal of -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=ON, unlike run_build (line 154). The enriched summary states the flag was removed from both functions. Please confirm whether:

  1. The flag removal should also apply to run_build_ent, or
  2. The annotation is incomplete in the provided context.

If the removal should apply, ensure both functions have consistent cmake arguments to avoid build path divergence.

concourse/scripts/build_tarball.bash (1)

175-190: Correct deprecation of single-partition flag in DSS tarball build.

Removing -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=OFF from the DSS cmake invocation is the correct counterpart to the removal in common.sh. The conditional logic (lines 175–179) correctly gates DSS builds to ELOQDSS_* types, and the simplified cmake command remains valid.

src/redis_service.cpp (5)

1055-1057: LGTM: Correctly sets local DS endpoint when using peer topology.

Calling SetThisNode after fetching from peer is the right place and order.


1114-1114: LGTM: StartService(true) for non-shared RocksDB.

Matches the “always create_if_missing” intent for local Rocks deployments.


1116-1118: LGTM: StartService bootstrap flag unified across backends.

Passing FLAGS_bootstrap || is_single_node is consistent with init semantics.


1126-1130: LGTM: DataStoreServiceClient signature update applied correctly.

Constructor now takes bootstrap/single-node flag first; args order looks right.


1061-1074: Confirm whether non-bootstrap multi-node path requires ng_leaders population and ThisNode initialization.

The code structure validates the review concerns:

  • In non-bootstrap scenarios where FLAGS_bootstrap and is_single_node are both false, ng_leaders remains empty when passed to TxConfigsToDssClusterConfig.
  • ds_config.SetThisNode() is only called in the peer branch (lines 1055–1057); it is never invoked in the else branch where TxConfigsToDssClusterConfig is called.
  • Line 2172 uses ng_leaders.size() to compute node_group_cnt, suggesting non-empty leaders are expected.

Verify whether TxConfigsToDssClusterConfig handles empty ng_leaders gracefully, or whether this code path should populate ng_leaders with a safe default and explicitly set ThisNode before the call.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/redis_service.cpp (1)

2162-2191: Division‑by‑zero if ng_leaders is empty.

node_group_cnt = ng_leaders.size() can be 0; subsequent reserve(bucket_owners.size() / node_group_cnt + 1) divides by zero. Add an early return (or error) when ng_leaders.empty().

Minimal safe fix:

 void RedisServiceImpl::GetNodeSlotsInfo(
     const std::unordered_map<NodeGroupId, NodeId> &ng_leaders,
     std::unordered_map<NodeId, std::vector<SlotPair>> &nodes_slots) const
 {
     LocalCcShards *local_cc_shards =
         txservice::Sharder::Instance().GetLocalCcShards();
     auto native_ng = txservice::Sharder::Instance().NativeNodeGroup();
     std::vector<std::pair<uint16_t, NodeGroupId>> bucket_owners =
         local_cc_shards->GetAllBucketOwners(native_ng);
 
+    if (ng_leaders.empty())
+    {
+        // No leaders known → no slot ownership info can be computed.
+        nodes_slots.clear();
+        return;
+    }
+
     // split buckets_in_ng for all ngs' leader node and sort bucket_id for each
     // ng leader node
     std::unordered_map<NodeId, std::vector<uint16_t>> owner_to_bucket_ids;
     uint32_t node_group_cnt = static_cast<uint32_t>(ng_leaders.size());

Optionally, also guard the denominator where reserve(...) uses node_group_cnt.

♻️ Duplicate comments (1)
src/redis_service.cpp (1)

1061-1074: Guard no‑peer, non‑bootstrap, multi‑node startup (ng_leaders empty).

This path can call TxConfigsToDssClusterConfig with an empty ng_leaders map when !FLAGS_bootstrap && !is_single_node, which was previously flagged. Add validation and fail fast (or supply a fallback) to avoid inconsistent DSS topology initialization.

Suggested fix (fail fast with a clear message):

         else
         {
-            std::unordered_map<uint32_t, uint32_t> ng_leaders;
-            if (FLAGS_bootstrap || is_single_node)
+            std::unordered_map<uint32_t, uint32_t> ng_leaders;
+            if (FLAGS_bootstrap || is_single_node)
             {
                 // For bootstrap, start all data store shards in current
                 // process.
                 for (const auto &[ng_id, ng_config] : ng_configs)
                 {
                     ng_leaders.emplace(ng_id, node_id);
                 }
             }
 
             EloqDS::DataStoreServiceClient::TxConfigsToDssClusterConfig(
-                node_id, ng_configs, ng_leaders, ds_config);
+                node_id, ng_configs, ng_leaders, ds_config);
+
+            if (ng_leaders.empty() && !FLAGS_bootstrap && !is_single_node)
+            {
+                LOG(ERROR) << "No peer configured and not in bootstrap/single-node: "
+                           << "unable to derive ng_leaders/topology. "
+                           << "Set --eloq_dss_peer_node or run with --bootstrap.";
+                return false;
+            }
         }
🧹 Nitpick comments (1)
src/redis_service.cpp (1)

2210-2239: Slot mapping refactor is correct; remove magic numbers and set end initially.

  • Use txservice::total_range_buckets instead of literal 1024.
  • Define kSlotsPerBucket instead of literal 16.
  • Initialize end_slot_range to start_slot_range for robustness.
-        // map buckets to slots
-        assert(txservice::total_range_buckets == 1024);
-        std::vector<uint32_t> slot_ids;
-        // slots count is 16 times of bucket count
-        slot_ids.reserve(bucket_ids.size() * 16);
-        for (uint16_t bucket_id : bucket_ids)
-        {
-            for (int i = 0; i < 16; i++)
-            {
-                slot_ids.push_back(bucket_id + i * 1024);
-            }
-        }
+        // map buckets to slots
+        constexpr uint32_t kSlotsPerBucket = 16;  // 16384 / 1024
+        static_assert(txservice::total_range_buckets == 1024,
+                      "Update slot expansion if bucket count changes.");
+        std::vector<uint32_t> slot_ids;
+        slot_ids.reserve(bucket_ids.size() * kSlotsPerBucket);
+        for (uint16_t bucket_id : bucket_ids)
+        {
+            for (uint32_t i = 0; i < kSlotsPerBucket; ++i)
+            {
+                slot_ids.push_back(bucket_id + i * txservice::total_range_buckets);
+            }
+        }
 
         std::sort(slot_ids.begin(), slot_ids.end());
 
         auto res = nodes_slots.try_emplace(node_id);
         std::vector<SlotPair> &slots_vec = res.first->second;
         SlotPair *slot_pair = &(slots_vec.emplace_back());
-        slot_pair->start_slot_range = slot_ids.front();
+        slot_pair->start_slot_range = slot_ids.front();
+        slot_pair->end_slot_range = slot_ids.front();
 
         for (size_t idx = 1; idx < slot_ids.size(); idx++)
         {
             if (slot_ids[idx - 1] + 1 != slot_ids[idx])
             {
                 slot_pair = &(slots_vec.emplace_back());
                 slot_pair->start_slot_range = slot_ids[idx];
             }
             slot_pair->end_slot_range = slot_ids[idx];
         }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9eceec5 and dbbdf89.

📒 Files selected for processing (1)
  • src/redis_service.cpp (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (3)
src/redis_service.cpp (3)

1055-1057: Peer path self-identification looks correct.

Setting ds_config.SetThisNode(local_ip, TxPort2DssPort(local_tx_port)) after fetching from peer is the right place and aligns with the revised bootstrap flow.


1113-1117: StartService() boolean usage is consistent with new API.

  • RocksDB path: true to always create if missing (non-shared) — OK.
  • Other paths: FLAGS_bootstrap || is_single_node — matches intent.

1126-1129: Cannot locate TxConfigsToDssClusterConfig implementation to verify current node is set.

The search confirmed two code paths exist:

  • Peer path (line 1055): explicitly calls ds_config.SetThisNode(...)
  • Non-peer path (line 1072–1073): calls TxConfigsToDssClusterConfig(node_id, ng_configs, ng_leaders, ds_config)

However, the implementation of TxConfigsToDssClusterConfig could not be found in the codebase search results. It may reside in an external library, header-only code, or be inaccessible to the search. Without verifying the implementation, I cannot definitively confirm whether TxConfigsToDssClusterConfig internally calls SetThisNode on the ds_config object.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/redis_service.cpp (1)

2171-2191: Division-by-zero when ng_leaders is empty

node_group_cnt = ng_leaders.size() can be 0, and reserve(bucket_owners.size() / node_group_cnt + 1) will divide by zero. Guard early and return (or error) if ng_leaders is empty. This also aligns with earlier review feedback.

Apply this minimal fix:

@@ void RedisServiceImpl::GetNodeSlotsInfo(
-    std::unordered_map<NodeId, std::vector<uint16_t>> owner_to_bucket_ids;
-    uint32_t node_group_cnt = static_cast<uint32_t>(ng_leaders.size());
+    std::unordered_map<NodeId, std::vector<uint16_t>> owner_to_bucket_ids;
+    uint32_t node_group_cnt = static_cast<uint32_t>(ng_leaders.size());
+    if (node_group_cnt == 0) {
+        DLOG(WARNING) << "GetNodeSlotsInfo: ng_leaders is empty; no slot mapping produced.";
+        return;
+    }
🧹 Nitpick comments (1)
src/redis_service.cpp (1)

1061-1074: Clarify empty ng_leaders semantics in non-bootstrap, multi-node startup

In the no-peer path, ng_leaders stays empty unless bootstrap/single-node. Please confirm TxConfigsToDssClusterConfig can handle an empty leadership map and that nothing downstream assumes leaders exist at this point. If empty means “leadership unresolved,” consider logging an explicit INFO to aid ops.

Would you like a guard + log here if ng_leaders.empty() and !FLAGS_bootstrap && !is_single_node?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dbbdf89 and 74530e8.

📒 Files selected for processing (1)
  • src/redis_service.cpp (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (3)
src/redis_service.cpp (3)

1055-1058: SetThisNode after fetching peer topology — LGTM

Setting ThisNode only after a successful FetchConfigFromPeer is correct and avoids stale self-identification.


1113-1117: StartService flags — LGTM

Always creating for RocksDB (true) and deferring to (FLAGS_bootstrap || is_single_node) for shared storage matches the comment and expected safety.


1126-1129: DataStoreServiceClient ctor update — LGTM

Passing the bootstrap flag into the client aligns with the new API surface.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74530e8 and fa38dd9.

📒 Files selected for processing (6)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (3 hunks)
  • src/redis_service.cpp (5 hunks)
  • store_handler (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
✅ Files skipped from review due to trivial changes (1)
  • tx_service
🚧 Files skipped from review as they are similar to previous changes (2)
  • concourse/scripts/build_tarball.bash
  • concourse/scripts/common.sh
🧰 Additional context used
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (4)
store_handler (1)

1-1: Unable to access submodule commits for verification; manual review required.

The submodule commits (ce0e9317...→7a9459481...) cannot be inspected in this environment due to remote access restrictions. To verify the submodule changes:

  • Clone the repository locally and run: cd store_handler && git log ce0e9317fd486a0e5a1342228ca4f2805d492daf..7a9459481634fc344ac7b2c63e4156b0e22f187a --stat to see what changed
  • Check the submodule's commit history for breaking API changes (especially headers/interfaces)
  • Confirm the changes align with the PR's "multi-dss shard" support goals
  • Ensure any API updates mentioned in the parent PR are reflected in the submodule
src/redis_service.cpp (3)

1192-1197: Updated DataStoreServiceClient ctor usage looks correct

Constructor args align with the new signature and bootstrap/single-node flag. LGTM.


2277-2306: Slot range assembly: single‑slot case now initialized correctly

Initializing end_slot_range alongside start_slot_range fixes the previously reported edge case. The range stitching loop is correct. LGTM.


1066-1069: Concern cannot be fully verified from codebase analysis alone

The review identifies two initialization paths that are handled differently:

  • Peer path (line 1056-1069): FetchConfigFromPeer() followed by explicit SetThisNode()
  • Non-peer path (lines 1070-1141): TxConfigsToDssClusterConfig() called in both bootstrap (line 1125) and non-bootstrap (line 1138) branches

Both non-peer paths correctly derive the node_id (either tmp_node_id or node_id) by matching against local_ip and local_tx_port. However, the implementation of TxConfigsToDssClusterConfig() is not accessible in the repository, making it impossible to confirm whether it properly initializes the this-node metadata in DataStoreServiceClusterManager as the peer path does via SetThisNode().

@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch 3 times, most recently from 9a4ccf5 to 14981a1 Compare November 11, 2025 10:40
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
src/redis_service.cpp (2)

2242-2261: Critical: division-by-zero when ng_leaders is empty.

node_group_cnt = ng_leaders.size() can be 0; subsequent reserve(.../node_group_cnt) divides by zero and is UB. Guard the empty case.

 void RedisServiceImpl::GetNodeSlotsInfo(
     const std::unordered_map<NodeGroupId, NodeId> &ng_leaders,
     std::unordered_map<NodeId, std::vector<SlotPair>> &nodes_slots) const
 {
@@
-    uint32_t node_group_cnt = static_cast<uint32_t>(ng_leaders.size());
+    uint32_t node_group_cnt = static_cast<uint32_t>(ng_leaders.size());
+    if (node_group_cnt == 0) {
+        // Nothing to assign; avoid UB on division by zero.
+        return;
+    }
@@
                 .reserve(bucket_owners.size() / node_group_cnt + 1);
@@
                 .reserve(bucket_owners.size() / node_group_cnt + 1 +
                          current_size);
#!/bin/bash
# Expect: no occurrences of division by zero introduced by size() usage hereafter.
rg -n -C2 'bucket_owners\.size\(\)\s*/\s*node_group_cnt' src/redis_service.cpp

1102-1118: Bug: Iterating the wrong container when searching tmp_node_id in bootstrap path.

This loop should scan tmp_ng_configs (freshly loaded) but iterates ng_configs. It can fail to find the local node or pick a wrong node_id.

Apply fix:

-                bool found = false;
-                uint32_t tmp_node_id = 0;
-                for (auto &pair : ng_configs)
+                bool found = false;
+                uint32_t tmp_node_id = 0;
+                for (const auto &pair : tmp_ng_configs)
                 {
                     auto &ng_nodes = pair.second;
                     for (size_t i = 0; i < ng_nodes.size(); i++)
                     {
                         if (ng_nodes[i].host_name_ == local_ip &&
                             ng_nodes[i].port_ == local_tx_port)
                         {
                             tmp_node_id = ng_nodes[i].node_id_;
                             found = true;
-                            break;
+                            break;
                         }
                     }
+                    if (found) break;
                 }
🧹 Nitpick comments (1)
src/redis_service.cpp (1)

2281-2310: Optional: avoid 16× expansion and sort; emit ranges layer-by-layer.

Current approach builds a 16N vector and sorts it. You can sort bucket_ids once and emit contiguous slot ranges per “layer” (i in [0..15]) for O(N log N + 16N) time and O(N) memory.

Illustrative replacement for the mapping/range build:

-        // map buckets to slots
-        assert(txservice::total_range_buckets == 1024);
-        std::vector<uint32_t> slot_ids;
-        // slots count is 16 times of bucket count
-        slot_ids.reserve(bucket_ids.size() * 16);
-        for (uint16_t bucket_id : bucket_ids)
-        {
-            for (int i = 0; i < 16; i++)
-            {
-                slot_ids.push_back(bucket_id + i * 1024);
-            }
-        }
-        std::sort(slot_ids.begin(), slot_ids.end());
-
-        auto res = nodes_slots.try_emplace(node_id);
-        std::vector<SlotPair> &slots_vec = res.first->second;
-        SlotPair *slot_pair = &(slots_vec.emplace_back());
-        slot_pair->start_slot_range = slot_ids.front();
-        slot_pair->end_slot_range = slot_ids.front();
-        for (size_t idx = 1; idx < slot_ids.size(); idx++)
-        {
-            if (slot_ids[idx - 1] + 1 != slot_ids[idx])
-            {
-                slot_pair = &(slots_vec.emplace_back());
-                slot_pair->start_slot_range = slot_ids[idx];
-            }
-            slot_pair->end_slot_range = slot_ids[idx];
-        }
+        std::sort(bucket_ids.begin(), bucket_ids.end());
+        auto res = nodes_slots.try_emplace(node_id);
+        std::vector<SlotPair> &slots_vec = res.first->second;
+        // Each bucket maps to 16 slots: bucket + i*1024
+        for (int layer = 0; layer < 16; ++layer) {
+            size_t start_idx = 0;
+            while (start_idx < bucket_ids.size()) {
+                uint32_t range_start = bucket_ids[start_idx] + layer * 1024;
+                uint32_t range_end = range_start;
+                size_t end_idx = start_idx;
+                while (end_idx + 1 < bucket_ids.size()
+                       && bucket_ids[end_idx + 1] == bucket_ids[end_idx] + 1) {
+                    ++end_idx;
+                    range_end = bucket_ids[end_idx] + layer * 1024;
+                }
+                SlotPair &sp = slots_vec.emplace_back();
+                sp.start_slot_range = range_start;
+                sp.end_slot_range = range_end;
+                start_idx = end_idx + 1;
+            }
+        }
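However the ranges are assembled, both formulations rest on the same invariant: the 16 slots derived from each of the 1024 buckets (`bucket_id + layer * 1024`) partition the 16384-slot space with no overlaps or gaps. A standalone check of that invariant:

```cpp
#include <cstdint>
#include <set>

// Expand every bucket into its 16 slots (bucket_id + layer * 1024) and
// collect them; the mapping should cover slots 0..16383 exactly once.
std::set<uint32_t> ExpandAllBuckets() {
    std::set<uint32_t> slots;
    for (uint32_t bucket_id = 0; bucket_id < 1024; ++bucket_id) {
        for (uint32_t layer = 0; layer < 16; ++layer) {
            slots.insert(bucket_id + layer * 1024);
        }
    }
    return slots;
}
```

Since the set ends up with exactly 16384 distinct values spanning 0..16383, every Redis Cluster slot has exactly one owning bucket.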
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between fa38dd9 and 697761e.

📒 Files selected for processing (6)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (3 hunks)
  • src/redis_service.cpp (5 hunks)
  • store_handler (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (3)
  • concourse/scripts/build_tarball.bash
  • store_handler
  • tx_service
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-09-26T05:07:45.185Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: concourse/scripts/build_tarball_open.bash:121-121
Timestamp: 2025-09-26T05:07:45.185Z
Learning: The open log service in build_tarball_open.bash only supports ROCKSDB as the single log state option, unlike the main eloq log service which supports multiple WITH_LOG_STATE options. Therefore, no WITH_LOG_STATE configuration flag is needed for the open log service build.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T09:00:44.250Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: build_log_service.cmake:103-118
Timestamp: 2025-09-24T09:00:44.250Z
Learning: The build_log_service.cmake file is specifically for building the log_service submodule which only supports ROCKSDB log state, unlike build_eloq_log_service.cmake which handles multiple WITH_LOG_STATE options (MEMORY, ROCKSDB, ROCKSDB_CLOUD_S3, ROCKSDB_CLOUD_GCS). Therefore, requiring RocksDB unconditionally in build_log_service.cmake is the correct behavior.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T08:53:57.928Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: src/redis_service.cpp:94-118
Timestamp: 2025-09-24T08:53:57.928Z
Learning: The LOG_STATE_TYPE_RKDB_CLOUD macro gating in src/redis_service.cpp includes `&& !defined(LOG_STATE_TYPE_RKDB)` condition intentionally to prevent conflicts when LOG_STATE_TYPE_RKDB is defined unconditionally in some builds (build_log_service.cmake) alongside cloud variants. The CMake configuration in build_eloq_log_service.cmake ensures mutual exclusivity of log state types.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T08:45:08.818Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: build_log_service.cmake:128-135
Timestamp: 2025-09-24T08:45:08.818Z
Learning: The LOG_STATE_TYPE_RKDB compile definition in build_log_service.cmake is kept unconditional for compatibility reasons, despite architectural concerns about conflicting with WITH_LOG_STATE paradigm.

Applied to files:

  • concourse/scripts/common.sh
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (9)
concourse/scripts/common.sh (3)

148-154: Remove the deprecated cmake flag from run_build.

The removal of -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=ON aligns with the PR objective to support multi-dss-shard configurations. This change correctly removes the single-partition constraint. However, verify that all downstream code paths properly handle multi-shard initialization without this flag.


209-217: Verify the cmake flag removal in run_build_ent.

The -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=ON flag removal is consistent with the changes in run_build. Confirm that the enterprise build path correctly handles multi-shard DSS topology without requiring this flag.


2970-2970: Clarify why these tests are commented out.

Two test invocations have been commented out in the run_eloq_test function for ELOQDSS_ROCKSDB_CLOUD_S3:

  • Line 2970: branching_snapshot_test.py
  • Line 2979: rocksdb_cloud_data_store_purger_test.py

Are these tests temporarily disabled pending fixes, or should they be removed entirely? If intentional, consider adding a comment explaining the reason (e.g., "Disabled until multi-shard snapshot logic is finalized").

Also applies to: 2979-2979

src/redis_service.cpp (6)

838-845: Bootstrap solo-node init looks correct.

Creating a single NG with the local node for FLAGS_bootstrap is consistent with the new bootstrap flow.

Also applies to: 846-865
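As a rough sketch of what the bootstrap branch builds — the container and config types below are stand-ins (field names follow the review diffs: `node_id_`, `host_name_`, `port_`), not the project's real types:

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for the project's node config type.
struct NodeConfig {
    uint32_t node_id_;
    std::string host_name_;
    uint16_t port_;
};

// Bootstrap-path sketch: one node group (id 0) whose sole member is the
// local node, which is also recorded as its own leader.
void BuildSoloTopology(
    const std::string &local_ip,
    uint16_t local_tx_port,
    std::unordered_map<uint32_t, std::vector<NodeConfig>> &ng_configs,
    std::unordered_map<uint32_t, uint32_t> &ng_leaders) {
    ng_configs[0] = {NodeConfig{0, local_ip, local_tx_port}};
    ng_leaders[0] = 0;
}
```

This is why the bootstrap flow can skip ReadClusterConfigFile/ParseNgConfig entirely: the whole topology is derivable from the local flags.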


1071-1073: Good: SetThisNode after fetching DSS topology.

Setting this node after FetchConfigFromPeer prevents stale identity; Tx→DSS port mapping is used correctly.


1184-1188: RocksDB path: StartService(true) matches new semantics.

Always creating DB for local storage is expected. LGTM.


1197-1200: Client ctor updated to pass bootstrap_flag: OK.

Constructor call matches the new signature and intent.


1212-1219: Connect gating is correct.

Connecting only in bootstrap/single-node avoids premature connects in clustered startup.


1076-1131: Bootstrap DSS config assembly overall is sound after the container fix.

Building tmp_ng_configs, resolving tmp_node_id, and setting ng_leaders per-NG for DSS bootstrap aligns with the multi-shard design.

Please confirm TxConfigsToDssClusterConfig gracefully handles empty ng_leaders in the non-single-node, non-bootstrap path (it’s invoked below with possibly empty ng_leaders).

Also applies to: 1133-1144

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
concourse/scripts/common.sh (1)

142-154: Flag removal verified; add tracking comments for disabled tests.

The -DUSE_ONE_ELOQDSS_PARTITION_ENABLED flag has been completely and consistently removed from both run_build() and run_build_ent() functions. However, multiple test cases in the run_eloq_test function (lines 2970–2979 area) are commented out without explanation or tracking:

  • branching_snapshot_test
  • rocksdb_cloud_data_store_purger_test
  • ttl tests

Add TODO comments or issue references to these disabled tests explaining whether they are known failures under the multi-shard model or technical debt to be re-enabled.

src/redis_service.cpp (1)

646-651: Typo in config validation log message.

Message mentions enable_io_uring twice; should reference raft_log_async_fsync.

-        LOG(ERROR) << "Invalid config: when set `enable_io_uring`, "
-                      "should also set `enable_io_uring`.";
+        LOG(ERROR) << "Invalid config: when set `raft_log_async_fsync`, "
+                      "should also set `enable_io_uring`.";
♻️ Duplicate comments (1)
src/redis_service.cpp (1)

1102-1118: Break outer loop once tmp_node_id is found (tiny efficiency nit).

Stop scanning tmp_ng_configs after the match; adds no behavior change, reduces work.

                 for (const auto &pair : tmp_ng_configs)
                 {
                     auto &ng_nodes = pair.second;
                     for (size_t i = 0; i < ng_nodes.size(); i++)
                     {
                         if (ng_nodes[i].host_name_ == local_ip &&
                             ng_nodes[i].port_ == local_tx_port)
                         {
                             tmp_node_id = ng_nodes[i].node_id_;
                             found = true;
                             break;
                         }
                     }
+                    if (found) break;
                 }
🧹 Nitpick comments (1)
src/redis_service.cpp (1)

2281-2311: Per-slot expansion and contiguous-range emission looks correct.

  • Properly initializes start/end; sorted slots guarantee correct range stitching.
  • Optional: replace magic constants with named constants for readability.
-        for (int i = 0; i < 16; i++)
+        constexpr int kSlotsPerBucket = 16;
+        constexpr int kBucketsPerLayer = 1024;
+        for (int i = 0; i < kSlotsPerBucket; i++)
         {
-            slot_ids.push_back(bucket_id + i * 1024);
+            slot_ids.push_back(bucket_id + i * kBucketsPerLayer);
         }
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 697761e and 14981a1.

📒 Files selected for processing (6)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (3 hunks)
  • src/redis_service.cpp (5 hunks)
  • store_handler (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
✅ Files skipped from review due to trivial changes (1)
  • tx_service
🚧 Files skipped from review as they are similar to previous changes (1)
  • concourse/scripts/build_tarball.bash
🧰 Additional context used
🧠 Learnings (4): same four learnings as the previous review, applied to concourse/scripts/common.sh.
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (8)
store_handler (1)

1-1: Verify submodule integration with updated DataStoreServiceClient API via integration tests.

The sandbox environment cannot access the submodule's specific commit. Please manually verify that the new store_handler commit 19ea736a correctly integrates with these API changes:

  • DataStoreServiceClient constructor requires bootstrap_flag as first parameter
  • TxConfigsToDssClusterConfig signature changed to node_id-centric form
  • StartService signature simplified to StartService(bool)

Confirm integration tests validate the bootstrap flow across the updated APIs.

concourse/scripts/common.sh (1)

203-217: The review comment's stated concern appears to be moot—the -DUSE_ONE_ELOQDSS_PARTITION_ENABLED flag is already absent from both run_build() and run_build_ent() functions.

Verification results:

  • run_build() cmake: -DCMAKE_BUILD_TYPE, -DWITH_DATA_STORE, -DBUILD_WITH_TESTS=ON, -DWITH_LOG_SERVICE=ON
  • run_build_ent() cmake: Same as above, plus -DOPEN_LOG_SERVICE=OFF and -DFORK_HM_PROCESS=ON

The flag referenced in the review comment does not exist in either function, so there is no inconsistency to resolve. Note that run_build_ent() intentionally has two additional configuration flags (-DOPEN_LOG_SERVICE=OFF, -DFORK_HM_PROCESS=ON) that differentiate it from run_build(), which appears to be by design.

Likely an incorrect or invalid review comment.

src/redis_service.cpp (6)

838-845: Bootstrap solo-node initialization looks correct.

Solo ng_configs seeded with current node; clean fallback to file/flags otherwise.

Also applies to: 846-865


1070-1073: Good: SetThisNode after fetching peer topology.

This ordering avoids stale self-node metadata.


1124-1144: Bootstrap and single-node ng_leaders derivation is reasonable.

Assigning this node as leader in bootstrap/single-node fits the intended semantics.


1184-1188: StartService gating is appropriate.

RocksDB non-shared always true; others respect bootstrap||single-node.


1197-1200: Client constructor updated with bootstrap flag — OK.

Matches new API and call-site semantics.


1212-1219: Deferred Connect unless bootstrap/single-node — good.

Avoids premature connections before role/topology is known.

@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch 3 times, most recently from 79e2b47 to cd22760 Compare November 12, 2025 06:34
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/redis_service.cpp (1)

2285-2314: Avoid 16× expansion/sort for slots; emit ranges per layer directly

Current code builds and sorts a 16× larger vector per node. You can sort bucket_ids once and emit contiguous slot ranges per layer to cut allocations and O(16N log 16N) sorting.

Example replacement:

-        // map buckets to slots
-        assert(txservice::total_range_buckets == 1024);
-        std::vector<uint32_t> slot_ids;
-        slot_ids.reserve(bucket_ids.size() * 16);
-        for (uint16_t bucket_id : bucket_ids) {
-            for (int i = 0; i < 16; i++) {
-                slot_ids.push_back(bucket_id + i * 1024);
-            }
-        }
-        std::sort(slot_ids.begin(), slot_ids.end());
-        auto res = nodes_slots.try_emplace(node_id);
-        std::vector<SlotPair> &slots_vec = res.first->second;
-        SlotPair *slot_pair = &(slots_vec.emplace_back());
-        slot_pair->start_slot_range = slot_ids.front();
-        slot_pair->end_slot_range = slot_ids.front();
-        for (size_t idx = 1; idx < slot_ids.size(); idx++) {
-            if (slot_ids[idx - 1] + 1 != slot_ids[idx]) {
-                slot_pair = &(slots_vec.emplace_back());
-                slot_pair->start_slot_range = slot_ids[idx];
-            }
-            slot_pair->end_slot_range = slot_ids[idx];
-        }
+        assert(txservice::total_range_buckets == 1024);
+        std::sort(bucket_ids.begin(), bucket_ids.end());
+        auto res = nodes_slots.try_emplace(node_id);
+        std::vector<SlotPair> &slots_vec = res.first->second;
+        for (int layer = 0; layer < 16; ++layer) {
+            size_t i = 0;
+            while (i < bucket_ids.size()) {
+                uint32_t start = bucket_ids[i] + layer * 1024;
+                uint32_t end = start;
+                size_t j = i;
+                while (j + 1 < bucket_ids.size() &&
+                       bucket_ids[j + 1] == bucket_ids[j] + 1) {
+                    ++j;
+                    end = bucket_ids[j] + layer * 1024;
+                }
+                SlotPair &sp = slots_vec.emplace_back();
+                sp.start_slot_range = start;
+                sp.end_slot_range = end;
+                i = j + 1;
+            }
+        }

Optional: early‑return if ng_leaders.empty() to skip all work.

🧹 Nitpick comments (2)
include/redis_service.h (1)

242-242: LGTM – Parameter rename improves semantic clarity.

Renaming the parameter from slot_num to slot_id better conveys that it represents a slot identifier rather than a count, and aligns with the implementation in src/redis_service.cpp and the broader per-slot expansion in this PR.
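For reference, the `slot_id` feeds a Redis Cluster MOVED redirect, whose wire shape is `MOVED <slot_id> <host>:<port>` with `slot_id` in [0, 16383]. A hypothetical formatting helper (name and signature are ours, not the project's):

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper illustrating the reply shape a slot_id feeds into:
// Redis Cluster clients expect "MOVED <slot_id> <host>:<port>".
std::string MakeMovedError(uint16_t slot_id,
                           const std::string &host,
                           uint16_t port) {
    return "MOVED " + std::to_string(slot_id) + " " + host + ":" +
           std::to_string(port);
}
```

Passing a slot identifier rather than a count is what makes the redirect target unambiguous to cluster-aware clients.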

src/redis_service.cpp (1)

2588-2612: Replace magic mask 0x3fff with a named constant

Use a constexpr (e.g., kRedisClusterSlotMask = 0x3fff) to avoid literals in two places and improve readability.

-                uint16_t slot_id = key->Hash() & 0x3fff;
+                constexpr uint16_t kRedisClusterSlotMask = 0x3fff; // 16383
+                uint16_t slot_id = key->Hash() & kRedisClusterSlotMask;
@@
-                uint16_t slot_id = keys->at(0).Hash() & 0x3fff;
+                constexpr uint16_t kRedisClusterSlotMask = 0x3fff; // 16383
+                uint16_t slot_id = keys->at(0).Hash() & kRedisClusterSlotMask;
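Background on the mask itself: `0x3fff` works because the slot count 16384 is a power of two, so masking with `16384 - 1` is equivalent to taking the hash modulo 16384. A minimal illustration (the function name is ours, not the codebase's):

```cpp
#include <cstdint>

// For a power-of-two modulus m, (h & (m - 1)) == (h % m); 0x3fff is
// 16384 - 1, so the mask below assigns slots modulo 16384.
uint16_t SlotOf(uint64_t hash) {
    return static_cast<uint16_t>(hash & 0x3fff);
}
```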
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 14981a1 and cd22760.

📒 Files selected for processing (8)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (3 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_command.cpp (1 hunks)
  • src/redis_service.cpp (8 hunks)
  • store_handler (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
✅ Files skipped from review due to trivial changes (2)
  • src/redis_command.cpp
  • tx_service
🚧 Files skipped from review as they are similar to previous changes (1)
  • concourse/scripts/build_tarball.bash
🧰 Additional context used
🧠 Learnings (4): same four learnings as the previous review, applied to concourse/scripts/common.sh.
🧬 Code graph analysis (2)
include/redis_service.h (1)
src/redis_service.cpp (2)
  • GenerateMovedErrorMessage (2501-2525)
  • GenerateMovedErrorMessage (2501-2501)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (6)
store_handler (1)

1-1: Submodule commit hash cannot be verified in this environment—manual verification required.

The submodule pointer update in store_handler cannot be verified because the old commit hash (f03dd0d1901e97450d32dd7adf7858e7673af387) is not accessible in the submodule repository. This could indicate the commit was rebased, removed from history, or the submodule has limited history available.

Ensure that:

  • The new commit hash 19ea736ac39b384d9ba0bf7c2af1fc04a2fc436a exists and contains the intended multi-shard DSS changes
  • The submodule changes are compatible with updates in the main repository files
  • All related API signature changes are properly integrated
concourse/scripts/common.sh (1)

154-154: LGTM – Removal of deprecated partition flag aligns with multi-shard architecture.

The removal of -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=ON is consistent with the PR's architectural changes supporting multi-DSS shards and the removal of this option from CMakeLists.txt.

src/redis_service.cpp (4)

838-845: Solo‑node bootstrap path looks good

Using a single NG with only the local candidate simplifies bootstrap and avoids external config dependencies. No changes requested.


1060-1149: Ensure SetThisNode is applied when building ds_config from ng_configs (non‑peer path)

You call SetThisNode() only when config is fetched from a peer. For the local TxConfigsToDssClusterConfig path, please confirm it sets "this node" inside ds_config (IP + DSS port) or add an explicit SetThisNode() after constructing ds_config.

If needed, add after TxConfigsToDssClusterConfig:

@@
-                EloqDS::DataStoreServiceClient::TxConfigsToDssClusterConfig(
-                    node_id, ng_configs, ng_leaders, ds_config);
+                EloqDS::DataStoreServiceClient::TxConfigsToDssClusterConfig(
+                    node_id, ng_configs, ng_leaders, ds_config);
+                ds_config.SetThisNode(
+                    local_ip,
+                    EloqDS::DataStoreServiceClient::TxPort2DssPort(local_tx_port));

1188-1192: Double‑check StartService flags across backends

RocksDB path forces create_if_missing (true) — good for non‑shared storage. For other backends you use FLAGS_bootstrap || is_single_node. Please verify that StartService(false) in multi‑node, non‑bootstrap is the intended behavior (e.g., no shard creation, only service start), and that bootstrap/single‑node creates as expected.


2501-2512: LGTM: MOVED error now uses slot_id and per‑slot ranges

Signature and comparisons align with 0–16383 slot semantics. No changes requested.

@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch 2 times, most recently from fd9d2b0 to 4f9b323 Compare November 14, 2025 11:07
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/redis_service.cpp (1)

2247-2266: Guard against division-by-zero when ng_leaders is empty.

node_group_cnt can be 0, leading to reserve(bucket_owners.size() / node_group_cnt + 1). Add an early return.

Apply this diff:

@@ void RedisServiceImpl::GetNodeSlotsInfo(
-    uint32_t node_group_cnt = static_cast<uint32_t>(ng_leaders.size());
+    uint32_t node_group_cnt = static_cast<uint32_t>(ng_leaders.size());
+    if (node_group_cnt == 0) {
+        DLOG(WARNING) << "GetNodeSlotsInfo called with empty ng_leaders";
+        return;
+    }
♻️ Duplicate comments (2)
concourse/scripts/common.sh (1)

2970-2970: Add explanatory comments for deliberately commented-out tests.

Lines 2970 and 2979 comment out branching_snapshot_test and rocksdb_cloud_data_store_purger_test without inline explanation. As flagged in previous reviews, please clarify whether these tests are:

  1. Intentionally disabled due to incompatibility with multi-shard changes — if so, add a comment explaining why (e.g., "Disabled due to multi-shard slot mapping changes; re-enable when shards fully supported") and reference this PR.
  2. Temporarily disabled pending a fix — if so, add a TODO comment with a link to a tracking issue and expected re-enable timeline.

Without explanation, it's unclear whether these are known gaps or accidental omissions.

Also applies to: 2979-2979

src/redis_service.cpp (1)

1102-1122: Correct container used for tmp_node_id lookup (LGTM).

Switched to iterate tmp_ng_configs; breaks correctly once found.

🧹 Nitpick comments (1)
src/redis_service.cpp (1)

2286-2315: Per-slot mapping logic is correct; consider avoiding 16× expansion.

Current approach builds and sorts 16N slot_ids. You can emit ranges per 16 layers over sorted bucket_ids to reduce allocations and sort cost. I can provide a ready-to-drop helper if you want it.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between cd22760 and 4f9b323.

📒 Files selected for processing (9)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (3 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_command.cpp (1 hunks)
  • src/redis_service.cpp (8 hunks)
  • store_handler (1 hunks)
  • tests/unit/mono/expire.tcl (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
✅ Files skipped from review due to trivial changes (1)
  • store_handler
🚧 Files skipped from review as they are similar to previous changes (3)
  • concourse/scripts/build_tarball.bash
  • tx_service
  • src/redis_command.cpp
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-09-26T05:07:45.185Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: concourse/scripts/build_tarball_open.bash:121-121
Timestamp: 2025-09-26T05:07:45.185Z
Learning: The open log service in build_tarball_open.bash only supports ROCKSDB as the single log state option, unlike the main eloq log service which supports multiple WITH_LOG_STATE options. Therefore, no WITH_LOG_STATE configuration flag is needed for the open log service build.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T09:00:44.250Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: build_log_service.cmake:103-118
Timestamp: 2025-09-24T09:00:44.250Z
Learning: The build_log_service.cmake file is specifically for building the log_service submodule which only supports ROCKSDB log state, unlike build_eloq_log_service.cmake which handles multiple WITH_LOG_STATE options (MEMORY, ROCKSDB, ROCKSDB_CLOUD_S3, ROCKSDB_CLOUD_GCS). Therefore, requiring RocksDB unconditionally in build_log_service.cmake is the correct behavior.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T08:53:57.928Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: src/redis_service.cpp:94-118
Timestamp: 2025-09-24T08:53:57.928Z
Learning: The LOG_STATE_TYPE_RKDB_CLOUD macro gating in src/redis_service.cpp includes `&& !defined(LOG_STATE_TYPE_RKDB)` condition intentionally to prevent conflicts when LOG_STATE_TYPE_RKDB is defined unconditionally in some builds (build_log_service.cmake) alongside cloud variants. The CMake configuration in build_eloq_log_service.cmake ensures mutual exclusivity of log state types.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T08:45:08.818Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: build_log_service.cmake:128-135
Timestamp: 2025-09-24T08:45:08.818Z
Learning: The LOG_STATE_TYPE_RKDB compile definition in build_log_service.cmake is kept unconditional for compatibility reasons, despite architectural concerns about conflicting with WITH_LOG_STATE paradigm.

Applied to files:

  • concourse/scripts/common.sh
🧬 Code graph analysis (2)
include/redis_service.h (1)
src/redis_service.cpp (2)
  • GenerateMovedErrorMessage (2502-2526)
  • GenerateMovedErrorMessage (2502-2502)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (10)
concourse/scripts/common.sh (2)

148-154: CMake flag removal aligns with multi-shard changes.

The removal of -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=ON from the cmake invocation is consistent with the PR objectives to support multi-DSS sharding.


209-217: CMake flag removal in enterprise build is consistent.

The removal of -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=ON from the run_build_ent cmake invocation mirrors the change in run_build, maintaining consistency across build configurations.

tests/unit/mono/expire.tcl (1)

156-160: Approve the relaxation, but consider verifying clock skew tolerance.

The change appropriately addresses clock skew in the multi-shard environment. When SETEX calculates expiration on one node and PTTL reads from another, a 100ms clock difference can cause PTTL to return up to 1100ms for a 1-second TTL.

However, there are a few concerns:

  1. Grammar: Fix the comment text:

    • Line 157: "Why ttl <= 1100?" (to match the assertion)
    • Lines 157-158: "calculates" instead of "calculate" (one occurrence on each line)
  2. Clock skew magnitude: 100ms clock skew between nodes is significant. In production deployments with NTP, clock skew should typically be within a few milliseconds. Consider verifying whether this level of clock drift is expected in your deployment or if it indicates a clock synchronization issue.

  3. TODO tracking: The TODO on line 159 mentions unifying the calculation node. Consider tracking this as a follow-up task.

Apply this diff to fix the grammar:

-    # Why ttl < 1100 ? Because, SETEX calculate the expired timestamp on the node that client connected
-    # and PTTL calculate the expired timestamp on the node that key located. 
-    # The clock on the two nodes is not same. (TODO: the two command should calculate on same node)
+    # Why ttl <= 1100? Because SETEX calculates the expiration timestamp on the node that the client connected to,
+    # and PTTL calculates the expiration timestamp on the node where the key is located.
+    # The clocks on the two nodes may not be synchronized. (TODO: both commands should calculate on the same node)

Do you want me to verify the expected clock synchronization requirements for your deployment, or help track the TODO item?
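For concreteness, the skew arithmetic behind the 1100ms bound can be modeled as a toy calculation (illustrative only; the clock values are invented):

```cpp
#include <cstdint>

// Toy model of the cross-node TTL read discussed above (illustrative only).
// The writer stamps expire_at = writer_now + ttl_ms; a reader whose clock
// trails the writer's by reader_lag_ms observes expire_at - reader_now,
// which is inflated by exactly reader_lag_ms at the moment of the write.
int64_t ObservedPttlAtWrite(int64_t ttl_ms, int64_t reader_lag_ms)
{
    int64_t writer_now = 1'000'000;  // arbitrary reference clock, in ms
    int64_t reader_now = writer_now - reader_lag_ms;
    int64_t expire_at = writer_now + ttl_ms;
    return expire_at - reader_now;
}
```

With a 1000ms TTL and 100ms reader lag this yields 1100, matching the relaxed assertion bound.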

include/redis_service.h (1)

242-242: Param rename aligns with per-slot semantics (LGTM).

Declaration matches implementation; consistent with slot_id usage.

src/redis_service.cpp (6)

838-865: Bootstrap solo-node path: config assembly looks correct.

Single NG with local candidate is fine for bootstrapping.


1070-1073: SetThisNode uses TxPort2DssPort — please confirm port mapping.

Ensure TxPort2DssPort(local_tx_port) yields the DSS RPC port expected by peers.

Would you confirm against the DSS networking docs?


1128-1148: Bootstrap ng_leaders synthesized to local node — confirm multi-NG intent.

Making this node leader for all NGs during bootstrap is reasonable; confirm no side effects when peers later join and elect leaders.


1188-1192: API changes: verify StartService(bool) semantics and client bootstrap flag.

  • StartService(true) for RocksDB and (FLAGS_bootstrap || is_single_node) elsewhere: confirm the bool means “create if missing / bootstrap mode.”
  • DataStoreServiceClient(bootstrap_flag, …): confirm the flag drives lazy connect and topology init as intended when not bootstrap.

Also applies to: 1201-1206


2502-2526: MOVED error now keyed by slot_id (LGTM).

Range check and leader endpoint selection are correct; UNKNOWN fallback is acceptable.


2589-2613: slot_id = hash & 0x3fff matches 16384-slot cluster.

Both single- and multi-key paths compute consistent slot_ids.
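For reference, the standard Redis cluster slot computation that the `Hash() & 0x3fff` convention corresponds to can be sketched as follows (a standalone sketch of CRC16-XMODEM plus hash-tag handling, not the project's EloqKey::Hash()):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Bitwise CRC16-XMODEM (poly 0x1021, init 0), the checksum the Redis
// cluster spec defines for key slots.
uint16_t Crc16Xmodem(const char *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; ++i)
    {
        crc ^= static_cast<uint16_t>(static_cast<uint8_t>(data[i])) << 8;
        for (int bit = 0; bit < 8; ++bit)
        {
            crc = (crc & 0x8000)
                      ? static_cast<uint16_t>((crc << 1) ^ 0x1021)
                      : static_cast<uint16_t>(crc << 1);
        }
    }
    return crc;
}

// Slot computation with hash-tag support: if the key contains "{...}" with
// a non-empty body, only that body is hashed, so related keys co-locate.
uint16_t KeySlot(const std::string &key)
{
    size_t open = key.find('{');
    if (open != std::string::npos)
    {
        size_t close = key.find('}', open + 1);
        if (close != std::string::npos && close > open + 1)
        {
            return Crc16Xmodem(key.data() + open + 1, close - open - 1) & 0x3fff;
        }
    }
    return Crc16Xmodem(key.data(), key.size()) & 0x3fff;
}
```

The spec's check value is CRC16("123456789") == 0x31C3, and KeySlot("foo") yields 12182, matching CLUSTER KEYSLOT foo; any divergence from these in EloqKey::Hash() would show up as MOVED redirects pointing at the wrong owner.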

@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from 4f9b323 to ac48184 Compare November 17, 2025 07:52

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
src/redis_service.cpp (1)

2286-2315: Slot mapping correctness fixed; prefer named constants and (optional) no-16× expansion.

  • Good: end_slot_range is initialized; sort then coalesce ranges.
  • Readability: replace magic 16/1024 with constants.
  • Perf: the previously suggested per-layer approach avoids building a 16×-larger vector. See earlier sketch; apply if needed.
-        assert(txservice::total_range_buckets == 1024);
-        std::vector<uint32_t> slot_ids;
-        // slots count is 16 times of bucket count
-        slot_ids.reserve(bucket_ids.size() * 16);
-        for (uint16_t bucket_id : bucket_ids)
-        {
-            for (int i = 0; i < 16; i++)
-            {
-                slot_ids.push_back(bucket_id + i * 1024);
-            }
-        }
+        constexpr uint16_t kBucketStride = 1024;   // anchored by assert
+        constexpr int kSlotsPerBucket = 16;
+        assert(txservice::total_range_buckets == kBucketStride);
+        std::vector<uint32_t> slot_ids;
+        slot_ids.reserve(bucket_ids.size() * kSlotsPerBucket);
+        for (uint16_t bucket_id : bucket_ids) {
+            for (int i = 0; i < kSlotsPerBucket; i++) {
+                slot_ids.push_back(static_cast<uint32_t>(bucket_id) +
+                                   i * kBucketStride);
+            }
+        }
🧹 Nitpick comments (5)
src/redis_command.cpp (1)

1678-1679: Hash semantics verified; consider factoring magic number across codebase.

EloqKey::Hash() correctly implements CRC16-XMODEM with Redis hash tag support, properly handling {…} patterns as per cluster spec. The mask 0x3fff correctly produces slot range [0, 16383].

Minor refactoring opportunity: The magic number 0x3fff appears in at least three locations (src/redis_command.cpp:1678, src/redis_service.cpp:2589, src/redis_service.cpp:2608). Consider extracting a shared constant to improve maintainability:

static constexpr uint16_t kClusterSlotMask = 0x3fff;
src/redis_service.cpp (4)

1076-1149: Bootstrap leader mapping and config derivation — verify intent; consider small refactor.

  • Bootstrap maps every NG leader to this node (tmp_node_id). Confirm this is the intended “initialize all shards locally” behavior even if the node appears in only one NG in the config.
  • Minor: the “load-or-parse ng_configs” code is duplicated here; extract a helper to keep the two call sites in sync.

1188-1192: StartService create-if-missing gating looks right.
Non-shared storage: always true; shared: (bootstrap || single-node). Consider adding an INFO log clarifying the flag used for easier ops debugging.

-        ret = data_store_service_->StartService(FLAGS_bootstrap ||
-                                                is_single_node);
+        bool create_if_missing = (FLAGS_bootstrap || is_single_node);
+        LOG(INFO) << "DataStoreService StartService(create_if_missing="
+                  << (create_if_missing ? "true" : "false") << ")";
+        ret = data_store_service_->StartService(create_if_missing);

2502-2526: MOVED message: add defensive guard for empty hosts.
Rare, but if a node group had no online replicas between the ClusterSlots call and the lookup, hosts.front() would be undefined behavior. Guard against it and keep the UNKNOWN fallback.

-        if (slot_info.start_slot_range <= slot_id &&
-            slot_info.end_slot_range >= slot_id)
-        {
-            error_msg.append(" ");
-            error_msg.append(slot_info.hosts.front().ip);
-            error_msg.append(":");
-            error_msg.append(std::to_string(slot_info.hosts.front().port));
-            return error_msg;
-        }
+        if (slot_info.start_slot_range <= slot_id &&
+            slot_info.end_slot_range >= slot_id) {
+            if (slot_info.hosts.empty()) {
+                LOG(WARNING) << "No hosts for slot range ["
+                             << slot_info.start_slot_range << "-"
+                             << slot_info.end_slot_range << "]";
+                break; // fall through to UNKNOWN
+            }
+            error_msg.append(" ");
+            error_msg.append(slot_info.hosts.front().ip);
+            error_msg.append(":");
+            error_msg.append(std::to_string(slot_info.hosts.front().port));
+            return error_msg;
+        }

2588-2594: Deduplicate slot mask literal; use a named constexpr.
Predeclare kRedisSlotMask = 0x3fff and use it at both call sites.

-                uint16_t slot_id = key->Hash() & 0x3fff;
+                uint16_t slot_id = key->Hash() & kRedisSlotMask;
@@
-                uint16_t slot_id = keys->at(0).Hash() & 0x3fff;
+                uint16_t slot_id = keys->at(0).Hash() & kRedisSlotMask;

Add once near file top (after SEC_LOCAL or another constants block):

// Redis slots mask (16384 slots)
constexpr uint16_t kRedisSlotMask = 0x3fff;

Also applies to: 2608-2613

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f9b323 and ac48184.

📒 Files selected for processing (9)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (1 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_command.cpp (1 hunks)
  • src/redis_service.cpp (8 hunks)
  • store_handler (1 hunks)
  • tests/unit/mono/expire.tcl (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
✅ Files skipped from review due to trivial changes (1)
  • store_handler
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/unit/mono/expire.tcl
  • concourse/scripts/common.sh
  • tx_service
🧰 Additional context used
🧬 Code graph analysis (3)
include/redis_service.h (1)
src/redis_service.cpp (2)
  • GenerateMovedErrorMessage (2502-2526)
  • GenerateMovedErrorMessage (2502-2502)
src/redis_command.cpp (1)
include/redis_command.h (1)
  • key_ (3927-3929)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (5)
concourse/scripts/build_tarball.bash (1)

195-195: Partition flag removal aligns with PR objective.

The DSS build now omits the partition flag, consistent with the removal of USE_ONE_ELOQDSS_PARTITION_ENABLED across the codebase to support multi-shard sharding. The change is straightforward and correct.

Please confirm that the DSS project's CMakeLists.txt no longer defines or expects the USE_ONE_ELOQDSS_PARTITION_ENABLED option, ensuring the CMake configuration aligns with this removal.

include/redis_service.h (1)

242-242: Param rename aligns with slot_id terminology. LGTM.
Consistent with per-slot routing elsewhere.

src/redis_service.cpp (3)

838-865: Bootstrap solo-node ng_configs path — confirm NodeConfig semantics.
Using a single NG (id 0) with one candidate node is fine if downstream code expects a candidate as leader. Please confirm NodeConfig(0, ip, tx_port, true) implies a candidate/primary suitable for immediate bootstrap.


1070-1073: SetThisNode after peer-config fetch — good reorder.
Ensures self is registered against fetched topology before service start.


1201-1205: Verify DataStoreServiceClient constructor parameter order and semantics against store_handler submodule.

The store_handler dependency is not accessible in this environment. You must verify locally that the boolean arguments and parameter order at src/redis_service.cpp:1201-1205 match the constructor signature in store_handler/data_store_service_client.h to ensure:

  • FLAGS_bootstrap || is_single_node maps to the correct boolean parameter
  • eloq_dss_peer_node.empty() maps to the correct boolean parameter
  • All five parameters are in the intended order

@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from ac48184 to ce2d49a Compare November 17, 2025 09:14
@lzxddz lzxddz requested review from githubzilla and liunyl November 17, 2025 09:16
@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from b645b01 to c46acb1 Compare November 18, 2025 08:01
@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from c46acb1 to 0ffac4a Compare November 18, 2025 09:56
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/redis_service.cpp (3)

2290-2319: Bucket→slot mapping logic is correct; consider guarding empty ng_leaders

The new code that:

  • asserts total_range_buckets == 1024,
  • maps each bucket_id into 16 slots via bucket_id + i * 1024 (0 ≤ i < 16),
  • sorts slot_ids, and
  • coalesces contiguous ranges into SlotPair{start_slot_range,end_slot_range}

correctly produces per‑slot ranges over 0–16383 and ensures end_slot_range is always initialized.

The only fragile part is bucket_owners.size() / node_group_cnt when ng_leaders is empty (would be a divide‑by‑zero). Callers should always pass a non‑empty leader map, but for defensive robustness you might either early‑return when ng_leaders.empty() or add an explicit DCHECK(!ng_leaders.empty()) before computing node_group_cnt.


2506-2530: GenerateMovedErrorMessage now operates on slot IDs; behavior matches slot ranges

Switching GenerateMovedErrorMessage to take slot_id, emit that in "MOVED <slot_id> host:port", and compare against SlotInfo::start_slot_range/end_slot_range aligns the MOVED reply with the new 0–16383 slot space returned by RedisClusterSlots. Semantics look correct.

If MOVED responses ever become hot, you might consider caching the last RedisClusterSlots result instead of recomputing it on every MOVED, but that’s an optional optimization.


2583-2621: MOVED slot computation uses 14‑bit slot IDs; ensure hashing stays consistent

In the DATA_NOT_ON_LOCAL / WRITE_REQUEST_ON_SLAVE path you now compute:

  • For single‑key: uint16_t slot_id = key->Hash() & 0x3fff;
  • For multi‑key: uint16_t slot_id = keys->at(0).Hash() & 0x3fff;

and feed that into GenerateMovedErrorMessage(slot_id).

Given GetNodeSlotsInfo maps each bucket to 16 slots as bucket_id + i * 1024, this 0x3fff mask is consistent with a 16384‑slot space where bucket_id == slot_id % 1024. As long as TxKey/Sharder also derive the routing bucket from the low 10 bits (or an equivalent total_range_buckets‑based function), MOVED will redirect clients to the same owner that actually serves the key.

It may be worth factoring this slot calculation into a small helper (e.g., ComputeRedisSlotIdFromHash) shared with ClusterKeySlotCommand to avoid magic constants and keep hashing consistent across call sites.
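A sketch of what such a shared helper could look like (all names and constants here are hypothetical stand-ins; the real bucket count lives in txservice::total_range_buckets):

```cpp
#include <cstdint>

// Hypothetical shared constants/helper so CLUSTER KEYSLOT and the MOVED
// path cannot drift apart. Names are illustrative only.
constexpr uint32_t kRedisClusterSlots = 16384;
constexpr uint16_t kRedisSlotMask = kRedisClusterSlots - 1;  // 0x3fff
constexpr uint16_t kTotalRangeBuckets = 1024;

// Reduce a full key hash to the 14-bit cluster slot id.
constexpr uint16_t ComputeRedisSlotIdFromHash(uint64_t hash)
{
    return static_cast<uint16_t>(hash & kRedisSlotMask);
}

// With slots laid out as bucket_id + i * 1024 (0 <= i < 16), the owning
// bucket is just the slot id modulo the bucket count.
constexpr uint16_t SlotToBucket(uint16_t slot_id)
{
    return slot_id % kTotalRangeBuckets;
}
```

Both call sites in redis_service.cpp and the one in redis_command.cpp could then share the same mask and mapping.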

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c46acb1 and 0ffac4a.

📒 Files selected for processing (3)
  • src/redis_service.cpp (8 hunks)
  • store_handler (1 hunks)
  • tx_service (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tx_service
🚧 Files skipped from review as they are similar to previous changes (1)
  • store_handler
🧰 Additional context used
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (2)
src/redis_service.cpp (2)

838-858: Bootstrap ng_configs path is straightforward and consistent

The bootstrap branch that seeds ng_configs with a single local candidate node cleanly separates bootstrap from file/ParseNgConfig loading and is consistent with the later scan for (local_ip, local_tx_port) when deriving node_id/native_ng_id. I don’t see correctness issues here.


1070-1148: DSS bootstrap/no‑peer wiring looks good; double‑check peer‑node flag polarity

The new logic for the ELOQDS path:

  • FetchConfigFromPeer + SetThisNode(TxPort2DssPort(local_tx_port)) when eloq_dss_peer_node is set, and
  • bootstrap/no‑peer flow that reloads tmp_ng_configs, finds tmp_node_id, and fills ng_leaders either for bootstrap (this node as leader for all ngs) or single‑node,

is coherent and matches the intended semantics where ng_leaders is empty in non‑bootstrap multi‑node cases and handled inside TxConfigsToDssClusterConfig. The StartService(true) for RocksDB and StartService(FLAGS_bootstrap || is_single_node) for other backends also matches the “create if missing” vs. “attach to existing” story.

One thing to verify: in the DataStoreServiceClient constructor you pass eloq_dss_peer_node.empty() as the boolean after ds_config. If that parameter is conceptually “peer node present” (as suggested by the summary), it may want !eloq_dss_peer_node.empty() instead. Please confirm the expected meaning and adjust if necessary.

Also applies to: 1188-1192, 1201-1209

@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from 0ffac4a to cb803dd Compare November 18, 2025 10:06

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
src/redis_service.cpp (2)

2290-2319: Bucket→slot expansion and slot range compaction are correct for 16×1024=16384 slots

The new mapping from bucket_ids to per-slot IDs:

  • Uses assert(txservice::total_range_buckets == 1024) and slot = bucket_id + i * 1024 (i ∈ [0,15]), which correctly covers all 16384 Redis cluster slots without overlap.
  • Sorts slot_ids and then builds contiguous [start_slot_range, end_slot_range] SlotPairs; the early bucket_ids.empty() check and initialization of end_slot_range to slot_ids.front() avoid the previous single-slot/uninitialized issues.
  • Is only on the CLUSTER NODES/SLOTS path, so the extra temporary allocation and sort are acceptable.

If you ever change total_range_buckets, consider deriving the 16 multiplier from it (e.g., a constexpr slots_per_bucket or a named kRedisClusterSlots) to keep the relationship between buckets and slots self-documenting, but this is not required for correctness.


2506-2529: MOVED redirection now uses real cluster slot IDs consistent with CLUSTER SLOTS

The MOVED handling has been brought into line with Redis cluster semantics:

  • GenerateMovedErrorMessage(uint16_t slot_id) now works in terms of slot IDs and matches them against SlotInfo.start_slot_range/end_slot_range from RedisClusterSlots, then returns "MOVED <slot_id> host:port".
  • Callers compute slot_id as Hash() & 0x3fff for both single-key and multi-key (first-key) requests, which correctly maps into the 0–16383 slot space expected by clients and by the bucket→slot layout in GetNodeSlotsInfo.

This should make MOVED responses fully consistent with what clients see from CLUSTER SLOTS and CLUSTER NODES. If you want to avoid magic numbers, you could later introduce a named constant for the slot mask (e.g., kRedisClusterSlotMask = 0x3fff) and reuse it here and in the command code, but the current form is functionally sound.

Also applies to: 2593-2598, 2612-2617
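The slot-range lookup that backs the MOVED reply can be sketched in isolation (HostPort and SlotInfo here are hypothetical stand-ins for the real structs returned by RedisClusterSlots):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for the structs RedisClusterSlots returns.
struct HostPort
{
    std::string ip;
    uint16_t port;
};

struct SlotInfo
{
    uint32_t start_slot_range;
    uint32_t end_slot_range;
    std::vector<HostPort> hosts;
};

// Build a Redis-style "MOVED <slot> ip:port" reply by scanning slot ranges;
// falls back to UNKNOWN when no range (or no host) covers the slot.
std::string MovedError(uint16_t slot_id, const std::vector<SlotInfo> &slots)
{
    std::string msg = "MOVED " + std::to_string(slot_id);
    for (const SlotInfo &si : slots)
    {
        if (si.start_slot_range <= slot_id && slot_id <= si.end_slot_range &&
            !si.hosts.empty())
        {
            return msg + " " + si.hosts.front().ip + ":" +
                   std::to_string(si.hosts.front().port);
        }
    }
    return msg + " UNKNOWN";
}
```

The empty-hosts check folds in the defensive guard suggested in an earlier review round.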

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ffac4a and cb803dd.

📒 Files selected for processing (9)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (1 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_command.cpp (1 hunks)
  • src/redis_service.cpp (8 hunks)
  • store_handler (1 hunks)
  • tests/unit/mono/expire.tcl (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
✅ Files skipped from review due to trivial changes (1)
  • tx_service
🚧 Files skipped from review as they are similar to previous changes (6)
  • concourse/scripts/common.sh
  • include/redis_service.h
  • store_handler
  • tests/unit/mono/expire.tcl
  • concourse/scripts/build_tarball.bash
  • src/redis_command.cpp
🧰 Additional context used
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (2)
src/redis_service.cpp (2)

838-865: Bootstrap-only ng_configs initialization is consistent and self-contained

Constructing ng_configs as a single local node group when FLAGS_bootstrap is true cleanly isolates bootstrap from the persisted cluster topology and aligns with the later uses of node_id, native_ng_id, and ng_configs in this method. No functional issues spotted here.


1071-1148: ELOQDS bootstrap/multi-node wiring and create-if-missing flags look coherent; double-check NodeId expectations

The DSS paths behave coherently:

  • Peer mode: FetchConfigFromPeer followed by ds_config.SetThisNode(local_ip, TxPort2DssPort(local_tx_port)) correctly informs DSS which physical node this process runs on.
  • Bootstrap without peer: rebuilding tmp_ng_configs, locating tmp_node_id, and populating ng_leaders for all ng_id before calling TxConfigsToDssClusterConfig matches the intent of seeding DSS shards for the full cluster during bootstrap.
  • Non-bootstrap: passing an empty ng_leaders map for multi-node and a fully-populated one for single-node is consistent with the earlier discussion that multi-node startup should not pre-assign DSS leaders.
  • The create_if_missing behavior is now aligned between DataStoreService::StartService(...) and DataStoreServiceClient’s first boolean (true for local RocksDB, FLAGS_bootstrap || is_single_node otherwise), which avoids subtle mismatches.

One thing worth explicitly confirming in comments/design docs: in bootstrap mode, tx_service_ uses node_id derived from the synthetic ng_configs (always 0 here), while DSS bootstrap uses tmp_node_id from tmp_ng_configs. If these node IDs are intentionally independent identifiers across layers, this is fine; if not, consider either seeding ng_configs from tmp_ng_configs or adding a short comment clarifying the separation.

Also applies to: 1188-1192, 1201-1209

@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from cb803dd to 4a932e5 Compare November 18, 2025 11:28

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
src/redis_command.cpp (1)

1675-1679: Cluster slot computation relies on hash semantics and a magic mask

result_ = key_.Hash() & 0x3fff; will give values in [0, 16383], which matches the 16384 Redis cluster slots if EloqKey::Hash() is the canonical cluster hash. If any other code computes slot_id differently (e.g., % 16384, or from a different hash function), MOVED errors / routing could diverge.

Two small suggestions:

  • Consider replacing 0x3fff with a named constant (e.g., kRedisClusterSlotsMask or (kRedisClusterSlots - 1)) to make the 16384‑slot intent explicit.
  • Double‑check that all callers and sharding logic use the same key_.Hash() + 14‑bit mask convention for slot_id.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cb803dd and 4a932e5.

📒 Files selected for processing (9)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (1 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_command.cpp (1 hunks)
  • src/redis_service.cpp (8 hunks)
  • store_handler (1 hunks)
  • tests/unit/mono/expire.tcl (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/unit/mono/expire.tcl
  • concourse/scripts/common.sh
  • store_handler
  • tx_service
  • include/redis_service.h
🧰 Additional context used
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (6)
concourse/scripts/build_tarball.bash (1)

195-195: Align with multi-shard DSS initialization: approved with verification request.

The removal of -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=OFF is consistent with the PR objective to deprecate this build flag. The DSS build now relies on CMake default behavior for partition handling, which should support the multi-shard initialization mode.

Please confirm that the default CMake value for the partition flag (no longer explicitly set to OFF) is appropriate for multi-shard DSS deployments. Additionally, verify that this change is coordinated with corresponding updates in CMakeLists.txt mentioned in the enriched summary.

src/redis_service.cpp (5)

838-865: Bootstrap ng_configs initialization matches single-node bootstrap semantics

Using a solo NodeConfig for ng_configs under FLAGS_bootstrap keeps the bootstrap path independent of the persisted cluster config and still allows node_id/native_ng_id to be resolved correctly later. This aligns with the intent of “bootstrap only local txservice/logservice” and looks sound.


1060-1148: Confirm DSS bootstrap/no‑peer topology behavior

The new DSS init logic looks reasonable overall (peer path uses FetchConfigFromPeer + SetThisNode, non‑peer path uses TxConfigsToDssClusterConfig with either tmp_ng_configs for bootstrap or ng_configs plus ng_leaders for single/multi‑node).

One design point worth double‑checking: in the bootstrap/no‑peer branch, ng_leaders is populated with tmp_node_id for all tmp_ng_configs entries, even for node groups where this node may not be a member. This assumes TxConfigsToDssClusterConfig(tmp_node_id, tmp_ng_configs, ng_leaders, ds_config) interprets ng_leaders[ng_id] purely as “which node should own DSS shards for ng_id during bootstrap”, not “leader must belong to that ng”. If that assumption ever changes, it might be safer to only assign ng_leaders[ng_id] = tmp_node_id for groups that actually contain this node.

Please confirm that this matches the intended multi‑DSS‑shard bootstrap design and that TxConfigsToDssClusterConfig is robust to this mapping.


1188-1192: StartService and DataStoreServiceClient flags are coherent

The updated calls

  • data_store_service_->StartService(true) for RocksDB (always create if missing on local non‑shared storage), and
  • data_store_service_->StartService(FLAGS_bootstrap || is_single_node) for other ELOQDS backends,

together with the matching bootstrap/single‑node flag passed into DataStoreServiceClient, line up with the intended behavior: only bootstrap or true single‑node deployments auto‑initialize storage locally, while multi‑node, non‑bootstrap nodes rely on existing topology or a peer.

This looks internally consistent with the new API surface.

Also applies to: 1201-1209


2290-2319: Bucket→slot expansion and slot range construction look correct

The new logic that:

  • asserts txservice::total_range_buckets == 1024,
  • expands each bucket to 16 slots via slot_id = bucket_id + i * 1024 (yielding 16384 slots total),
  • sorts slot_ids, and
  • builds contiguous SlotPair ranges while initializing both start_slot_range and end_slot_range (which also fixes the single‑slot case),

matches the 16384‑slot Redis Cluster model and should interoperate cleanly with RedisClusterNodes/RedisClusterSlots consumers.

No functional issues spotted here.


2506-2529: MOVED error now keyed by full 14‑bit slot id – verify hash/slot consistency

Updating GenerateMovedErrorMessage to take a slot_id and computing it as key->Hash() & 0x3fff (0–16383) aligns MOVED replies with Redis’ 16384‑slot addressing and the new per‑slot SlotPair ranges.

The only assumption to confirm is that:

  • the same underlying hash used by Sharder/GetAllBucketOwners to assign buckets corresponds to this Hash() value, and
  • the bucket→slot expansion in GetNodeSlotsInfo is consistent with slot_id = Hash() & 0x3fff (i.e., that Hash() exposes the full CRC16‑style hash, not an already‑reduced bucket index).

If Hash() was historically returning a reduced bucket id, this change would alter the advertised slot in MOVED errors without changing actual routing. Please verify that TxKey::Hash() (and related code) now returns a value appropriate for 16384‑slot hashing so that MOVED slot ids and cluster slot maps stay in sync.

Also applies to: 2593-2598, 2613-2617

@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from 4a932e5 to 1c03ea4 Compare November 20, 2025 12:01

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
tests/unit/mono/expire.tcl (1)

152-160: PTTL tolerance change is reasonable; consider tightening comment wording.

Relaxing the upper bound to <= 1100 nicely accounts for cross-node clock skew between the SETEX node and the PTTL node while still catching gross errors. The explanatory comment is helpful; if you touch this again, you might slightly rephrase it (e.g., “SETEX computes the expire timestamp on the client’s node, while PTTL uses the key’s owner node; their clocks may differ”) but there’s no functional issue here.

src/redis_service.cpp (2)

841-868: Bootstrap and ELOQDS multi‑shard wiring look consistent; consider documenting the empty ng_leaders contract.

  • The new FLAGS_bootstrap branch that seeds ng_configs with a single local node (ng_id 0, node_id 0) is coherent with the later early‑exit bootstrap path and keeps bootstrap independent of any pre‑existing cluster_config file.
  • In the ELOQDS block:
    • Peer mode (eloq_dss_peer_node non‑empty) correctly pulls topology from the peer and then calls ds_config.SetThisNode(local_ip, TxPort2DssPort(local_tx_port)), which is what you want for joining an existing DSS cluster.
    • The bootstrap/no‑peer path reloads full tmp_ng_configs, finds the local tmp_node_id, marks it leader for all ng_ids in ng_leaders, and feeds that into TxConfigsToDssClusterConfig. That matches the goal of “bootstrap all DSS shards on this node.”
    • The non‑bootstrap path only populates ng_leaders for is_single_node, leaving it empty for multi‑node, which aligns with your earlier note that multi‑node non‑bootstrap should pass an empty map and let DSS resolve shard leadership itself.
  • The booleans for StartService(true) vs StartService(FLAGS_bootstrap || is_single_node) and the first parameter to DataStoreServiceClient are internally consistent and reflect the shared‑vs‑non‑shared storage distinction.

I’d only suggest adding a short comment where ng_leaders is intentionally left empty in the multi‑node non‑bootstrap case to encode that contract for future readers; behavior otherwise looks correct.

Also applies to: 1081-1159, 1202-1205, 1215-1223


2304-2333: Slot mapping and MOVED error generation are consistent with 16K‑slot Redis semantics.

  • GetNodeSlotsInfo now maps each bucket to 16 slots (bucket_id + i*1024 with total_range_buckets == 1024), sorts the resulting slot_ids, and collapses contiguous sequences into SlotPair ranges. That yields a clean 0–16383 slot space and supports multi‑shard leaders (one node leading multiple ngs) correctly.
  • MOVED handling uses slot_id = Hash() & 0x3fff for both single‑key and multi‑key requests, and GenerateMovedErrorMessage(uint16_t slot_id) searches the RedisClusterSlots output for a range containing that slot. Given CLUSTER KEYSLOT is using the same Hash() & 0x3fff convention, this keeps MOVED, CLUSTER SLOTS, and CLUSTER KEYSLOT aligned.
  • Using uint16_t for slot_id is safe here since the mask restricts values to [0, 16383].

If you want extra observability, you could log the slot_id in the “slot not found” error path, but functionally the changes look solid.

Also applies to: 2520-2531, 2607-2631
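The bucket→slot expansion and range compression described above can be sketched as follows (a Python rendering of the C++ logic; the function name is illustrative):

```python
TOTAL_BUCKETS = 1024
SLOTS_PER_BUCKET = 16  # 16384 / 1024

def buckets_to_slot_ranges(bucket_ids):
    """Expand each bucket into its 16 slots (bucket_id + i * 1024),
    then collapse the sorted slot list into contiguous [start, end]
    pairs, mirroring the SlotPair ranges built by GetNodeSlotsInfo."""
    slot_ids = sorted(b + i * TOTAL_BUCKETS
                      for b in bucket_ids
                      for i in range(SLOTS_PER_BUCKET))
    ranges = []
    for s in slot_ids:
        if ranges and s == ranges[-1][1] + 1:
            ranges[-1][1] = s          # extend the current contiguous run
        else:
            ranges.append([s, s])      # new run; also covers single slots
    return [tuple(r) for r in ranges]
```

For example, buckets {0, 1} yield sixteen two-slot ranges (0,1), (1024,1025), …, while a node owning all 1024 buckets collapses to the single range (0, 16383) — the clean 0–16383 slot space noted above.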

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a932e5 and 1c03ea4.

📒 Files selected for processing (9)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (1 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_command.cpp (1 hunks)
  • src/redis_service.cpp (8 hunks)
  • store_handler (1 hunks)
  • tests/unit/mono/expire.tcl (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (4)
  • include/redis_service.h
  • concourse/scripts/build_tarball.bash
  • tx_service
  • src/redis_command.cpp
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-09-26T05:07:45.185Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: concourse/scripts/build_tarball_open.bash:121-121
Timestamp: 2025-09-26T05:07:45.185Z
Learning: The open log service in build_tarball_open.bash only supports ROCKSDB as the single log state option, unlike the main eloq log service which supports multiple WITH_LOG_STATE options. Therefore, no WITH_LOG_STATE configuration flag is needed for the open log service build.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T09:00:44.250Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: build_log_service.cmake:103-118
Timestamp: 2025-09-24T09:00:44.250Z
Learning: The build_log_service.cmake file is specifically for building the log_service submodule which only supports ROCKSDB log state, unlike build_eloq_log_service.cmake which handles multiple WITH_LOG_STATE options (MEMORY, ROCKSDB, ROCKSDB_CLOUD_S3, ROCKSDB_CLOUD_GCS). Therefore, requiring RocksDB unconditionally in build_log_service.cmake is the correct behavior.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T08:45:08.818Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: build_log_service.cmake:128-135
Timestamp: 2025-09-24T08:45:08.818Z
Learning: The LOG_STATE_TYPE_RKDB compile definition in build_log_service.cmake is kept unconditional for compatibility reasons, despite architectural concerns about conflicting with WITH_LOG_STATE paradigm.

Applied to files:

  • concourse/scripts/common.sh
📚 Learning: 2025-09-24T08:53:57.928Z
Learnt from: githubzilla
Repo: eloqdata/eloqkv PR: 193
File: src/redis_service.cpp:94-118
Timestamp: 2025-09-24T08:53:57.928Z
Learning: The LOG_STATE_TYPE_RKDB_CLOUD macro gating in src/redis_service.cpp includes `&& !defined(LOG_STATE_TYPE_RKDB)` condition intentionally to prevent conflicts when LOG_STATE_TYPE_RKDB is defined unconditionally in some builds (build_log_service.cmake) alongside cloud variants. The CMake configuration in build_eloq_log_service.cmake ensures mutual exclusivity of log state types.

Applied to files:

  • concourse/scripts/common.sh
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (3)
concourse/scripts/common.sh (3)

142-154: Removal of USE_ONE_ELOQDSS_PARTITION_ENABLED flag is correctly applied.

The cmake invocation in run_build now removes the -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=ON flag as part of the multi-shard support changes. The -DWITH_LOG_SERVICE=ON flag is correctly retained. No concerns with this change.


203-217: Enterprise build function correctly mirrors flag removal.

The run_build_ent function similarly removes the -DUSE_ONE_ELOQDSS_PARTITION_ENABLED=ON flag while preserving the enterprise-specific configuration flags (OPEN_LOG_SERVICE=OFF, FORK_HM_PROCESS=ON) and -DWITH_LOG_SERVICE=ON.


2970-2980: Clarify the re-enablement of previously disabled tests.

Lines 2970 and 2979 now execute branching_snapshot_test and rocksdb_cloud_data_store_purger_test for the ELOQDSS_ROCKSDB_CLOUD_S3 storage type. Previous reviews flagged these as disabled. Please add a brief comment explaining:

  • Why these tests were originally disabled (e.g., multi-shard incompatibility).
  • What changed to make them compatible again (e.g., specific fixes or test updates in this PR).

If intentional, this is good; if accidental, these lines may need to remain commented. A clarifying comment will help future maintainers understand the decision.

store_handler Outdated
@@ -1 +1 @@
-Subproject commit 87ee52bf0672fee7c8f4b190c93b99cdfcb1bb37
+Subproject commit dab3aa8130da6cd414ed6623c24da42e28aabb10

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the submodule update aligns with multi-shard DSS changes.

The submodule pointer has been updated. Ensure that the changes in the store_handler submodule are compatible with the multi-shard DSS support and related Redis service layer modifications in this PR.

Run the following script to verify the submodule update:


🏁 Script executed:

#!/bin/bash
# Description: Verify the store_handler submodule commit exists and review the changes

# Check if the new commit exists in the submodule
cd store_handler 2>/dev/null || {
  echo "Error: store_handler directory not found"
  exit 1
}

# Verify the new commit exists
NEW_COMMIT="dab3aa8130da6cd414ed6623c24da42e28aabb10"
if git cat-file -e "$NEW_COMMIT^{commit}" 2>/dev/null; then
  echo "✓ New commit $NEW_COMMIT exists"
else
  echo "✗ New commit $NEW_COMMIT not found in store_handler"
  exit 1
fi

# Show the commit log between old and new commits
OLD_COMMIT="87ee52bf0672fee7c8f4b190c93b99cdfcb1bb37"
echo ""
echo "Changes in store_handler submodule:"
git log --oneline "$OLD_COMMIT..$NEW_COMMIT" 2>/dev/null || {
  echo "Unable to compare commits. Ensure the submodule is initialized and up-to-date."
}

# Show files changed in the submodule
echo ""
echo "Files changed in store_handler:"
git diff --name-status "$OLD_COMMIT" "$NEW_COMMIT" 2>/dev/null

Length of output: 80


🏁 Script executed:

#!/bin/bash
# Description: Investigate store_handler submodule status and commit validity

# Check submodule status and configuration
echo "=== Submodule Configuration ==="
git config -f .gitmodules --get submodule.store_handler.path
git config -f .gitmodules --get submodule.store_handler.url

echo ""
echo "=== Submodule Status ==="
git submodule status store_handler 2>/dev/null || echo "Submodule status unavailable"

echo ""
echo "=== Checking Remote Repository ==="
REMOTE_URL=$(git config -f .gitmodules --get submodule.store_handler.url)
if [ -n "$REMOTE_URL" ]; then
  echo "Remote URL: $REMOTE_URL"
  NEW_COMMIT="dab3aa8130da6cd414ed6623c24da42e28aabb10"
  echo "Checking if commit exists in remote:"
  git ls-remote "$REMOTE_URL" "$NEW_COMMIT" 2>/dev/null || {
    echo "Unable to query remote. Checking if commit appears in refs:"
    git ls-remote "$REMOTE_URL" | grep "$NEW_COMMIT" || echo "Commit not found in remote refs"
  }
else
  echo "Unable to determine remote URL"
fi

echo ""
echo "=== Current store_handler state (if initialized) ==="
if [ -d store_handler/.git ]; then
  cd store_handler
  echo "Current HEAD:"
  git rev-parse HEAD 2>/dev/null || echo "Unable to determine HEAD"
  echo "Remotes:"
  git remote -v 2>/dev/null || echo "No remotes configured"
  cd ..
else
  echo "Submodule not initialized locally"
fi

Length of output: 392


Submodule update points to non-existent commit - cannot proceed.

The store_handler submodule is being updated to commit dab3aa8130da6cd414ed6623c24da42e28aabb10, but this commit does not exist in the remote repository (https://github.com/eloqdata/store_handler.git). This would cause repository checkout and CI/CD failures.

Verify that:

  • The commit hash is correct
  • The changes have been pushed to the store_handler repository
  • The submodule pointer is updated to a valid, existing commit before merging
🤖 Prompt for AI Agents
In store_handler around lines 1 to 1, the submodule pointer references commit
dab3aa8130da6cd414ed6623c24da42e28aabb10, which does not exist in
https://github.com/eloqdata/store_handler.git. To fix:

  • Verify the correct commit hash in that remote (or push the missing commit).
  • Update the submodule pointer to a valid existing commit (git submodule sync; cd store_handler; git fetch && git checkout <valid-commit>).
  • Commit the updated .gitmodules/.gitlinks in the parent repo.
  • Re-run git submodule update --init --recursive to ensure CI can check out the submodule.

Update bucket map to key-slot for total_bucket_count changed to 1024

Remove compile option USE_ONE_ELOQDSS_PARTITION_ENABLED

Only use one node group to start logservice and txservice on bootstrap

Update ClusterKeySlot command

Dss client add arg bind_data_shard_with_ng

Update test case expire.tcl
@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from 1c03ea4 to 6c87cd4 Compare November 20, 2025 12:10
@lzxddz lzxddz force-pushed the dss_multi_shard-rebase branch from 6c87cd4 to e89f616 Compare November 21, 2025 03:26
@lzxddz lzxddz merged commit 7259173 into eloqdata:main Nov 21, 2025
3 of 5 checks passed

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
src/redis_command.cpp (1)

1676-1680: Cluster keyslot now maps directly to 16K slots; consider de‑duplicating the magic constant

result_ = key_.Hash() & 0x3fff; correctly constrains the slot to [0, 16383], which matches the hard‑coded 16384 slot count reported in ClusterInfoCommand. To keep things consistent and self‑documenting, consider introducing a shared constexpr (e.g. kRedisClusterSlots = 16384 and kRedisClusterSlotMask = kRedisClusterSlots - 1) and using it both here and in the cluster info strings instead of repeating 0x3fff/16384.
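The suggested de-duplication is small but keeps the count and the mask from drifting apart. In the C++ code this would be a shared constexpr pair; a Python sketch with the illustrative names proposed above:

```python
K_REDIS_CLUSTER_SLOTS = 16384                       # kRedisClusterSlots
K_REDIS_CLUSTER_SLOT_MASK = K_REDIS_CLUSTER_SLOTS - 1  # kRedisClusterSlotMask == 0x3fff

def cluster_keyslot(key_hash: int) -> int:
    """CLUSTER KEYSLOT result: the hash reduced to the 16384-slot space.
    Deriving the mask from the count means 0x3fff never appears twice."""
    return key_hash & K_REDIS_CLUSTER_SLOT_MASK
```

Both the keyslot computation and the cluster-info strings would then reference the same constant instead of repeating the magic number.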

src/redis_service.cpp (1)

2304-2333: Per-slot mapping from buckets looks logically correct; consider minor robustness and perf tweaks

The new mapping:

  • Asserts txservice::total_range_buckets == 1024,
  • Expands each bucket id b into 16 slots b + i * 1024 (0 ≤ i < 16),
  • Sorts slot_ids and compresses them into contiguous [start_slot_range,end_slot_range] pairs,

is consistent with a 16384-slot Redis cluster and should cover each slot exactly once per owning node, with correct handling of the single-slot case via initialization of both start_slot_range and end_slot_range.

Two optional improvements you may want to consider:

  • Robustness to future callers: if ng_leaders could ever be empty outside the current call sites, an early if (ng_leaders.empty()) return; would guard the bucket_owners.size() / node_group_cnt reserve against a regression back to division-by-zero if this function gets reused.
  • Performance: instead of the nested for (ng_leader) { for (bucket_owners) { ... } }, you can do a single pass over bucket_owners, look up the leader via ng_leaders.at(pair.second), and append the bucket to that leader’s vector. That reduces this block from O(|ng_leaders| * |bucket_owners|) to O(|bucket_owners|) with simpler control flow.

Functionally it looks fine as-is; these would just make it more future-proof and efficient.
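The single-pass rewrite suggested above would look roughly like this (a Python sketch of the C++ loop; the data shapes — bucket_owners as {bucket_id: ng_id} and ng_leaders as {ng_id: node_id} — are assumptions for illustration):

```python
from collections import defaultdict

def group_buckets_by_leader(bucket_owners, ng_leaders):
    """One pass over bucket_owners: look up each bucket's node-group
    leader and append the bucket to that leader's list. This is
    O(|bucket_owners|) rather than O(|ng_leaders| * |bucket_owners|)."""
    leader_buckets = defaultdict(list)
    for bucket_id, ng_id in bucket_owners.items():
        leader_buckets[ng_leaders[ng_id]].append(bucket_id)
    return dict(leader_buckets)
```

The control flow is also simpler: there is no nested scan per leader, just one dictionary lookup per bucket.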

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c03ea4 and e89f616.

📒 Files selected for processing (9)
  • CMakeLists.txt (0 hunks)
  • concourse/scripts/build_tarball.bash (1 hunks)
  • concourse/scripts/common.sh (1 hunks)
  • include/redis_service.h (1 hunks)
  • src/redis_command.cpp (1 hunks)
  • src/redis_service.cpp (8 hunks)
  • store_handler (1 hunks)
  • tests/unit/mono/expire.tcl (1 hunks)
  • tx_service (1 hunks)
💤 Files with no reviewable changes (1)
  • CMakeLists.txt
✅ Files skipped from review due to trivial changes (1)
  • tx_service
🚧 Files skipped from review as they are similar to previous changes (5)
  • concourse/scripts/build_tarball.bash
  • tests/unit/mono/expire.tcl
  • concourse/scripts/common.sh
  • store_handler
  • include/redis_service.h
🧰 Additional context used
🧬 Code graph analysis (1)
src/redis_service.cpp (1)
include/redis_service.h (1)
  • node_id (137-137)
🔇 Additional comments (3)
src/redis_service.cpp (3)

841-868: Bootstrap-only ng_configs initialization is consistent with single-node bootstrap semantics

Special-casing FLAGS_bootstrap to synthesize a single-node ng_configs entry for the local node, while keeping the existing config-file / ParseNgConfig path for non-bootstrap, looks coherent and keeps the rest of the initialization logic unchanged. No issues from my side here.


1081-1159: DSS bootstrap/topology wiring and create_if_missing flags look correct, but double-check intended leadership mapping

The new logic to:

  • Use SetThisNode() when bootstrapping from a peer, and
  • In the no-peer case, build ds_config via TxConfigsToDssClusterConfig with:
    • tmp_ng_configs + ng_leaders[ng_id] = tmp_node_id during FLAGS_bootstrap, and
    • ng_leaders populated only for is_single_node (left empty for multi-node),

plus threading FLAGS_bootstrap || is_single_node consistently into StartService() and the DataStoreServiceClient ctor, matches the multi-shard design and avoids creating data on non-leaders in regular multi-node startup.

One thing worth a quick sanity check: during bootstrap, ng_leaders is filled with the same tmp_node_id for every ng_id. If TxConfigsToDssClusterConfig assumes that ng_leaders[ng_id] must be a node that actually belongs to that ng_id in tmp_ng_configs, please confirm this “local node leads all ngs for bootstrap” mapping is explicitly supported there.

Also applies to: 1198-1206, 1215-1223


2520-2531: Slot-based MOVED error generation aligns with 16384 slots; confirm Hash() consistency with sharding

Renaming GenerateMovedErrorMessage to take a slot_id and computing that as key->Hash() & 0x3fff (for both single-key and multi-key requests) lines up with the Redis 0–16383 slot space and with the per-slot ranges produced by RedisClusterSlots / GetNodeSlotsInfo. The range check in GenerateMovedErrorMessage over slot_info.start_slot_range / end_slot_range should therefore send clients to the correct master.

To be completely sure there’s no subtle mismatch, please confirm that:

  • The same Hash() value (or at least its low bits) is what Sharder / bucket assignment uses, so that slot_id % 1024 matches the bucket ID used when building bucket_owners in GetNodeSlotsInfo.

If that holds, this MOVED-path change looks good.

Also applies to: 2607-2631
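The MOVED path described above amounts to a range search over the advertised slot ranges. A minimal sketch (Python with illustrative names; the reply format `MOVED <slot> <ip>:<port>` is standard Redis):

```python
def generate_moved_error(slot_id, node_slot_ranges):
    """Find the node whose [start, end] slot range contains slot_id and
    build the MOVED reply. node_slot_ranges is a list of
    (start_slot_range, end_slot_range, host, port) tuples, as produced
    from the cluster-slots output."""
    for start, end, host, port in node_slot_ranges:
        if start <= slot_id <= end:
            return f"MOVED {slot_id} {host}:{port}"
    return None  # the "slot not found" error path noted above

# Two-node example: each master owns half of the 16384 slots.
ranges = [(0, 8191, "10.0.0.1", 6379), (8192, 16383, "10.0.0.2", 6379)]
```

Since slot_id is masked to [0, 16383], the `None` branch should be unreachable when the ranges cover the full slot space — logging the slot_id there, as suggested, would make any gap visible.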
